2024-10-31

Reminders

  • There is only one point.

  • Patience is the capacity to welcome difficulty when it comes, with a spirit of strength, endurance, forbearance, and dignity rather than fear, anxiety, and avoidance.

  • Don't go so fast: rushing is the root of every evil.

  • Come back to basics: vows and practice, relax, and have fun.

  • Pay 100% attention to the task at hand. Give it all.

Focus of the day

  • Assistant code rewrite

TIL

Eval front end

  • The optimization idea is from semi-supervised learning

docetl

  • Very useful abstractions

Hamel's eval writeup

Hamel's takeaways:

  • Remove ALL friction from looking at data.
  • Keep it simple. Don’t buy fancy LLM tools. Use what you have first.
  • You are doing it wrong if you aren’t looking at lots of data.
  • Don’t rely on generic evaluation frameworks to measure the quality of your AI. Instead, create an evaluation system specific to your problem.
  • Write lots of tests and frequently update them.
  • LLMs can be used to unblock the creation of an eval system. Examples include using an LLM to:
    • Generate test cases and write assertions
    • Generate synthetic data
    • Critique and label data, etc.
  • Re-use your eval infrastructure for debugging and fine-tuning.
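
Note to self: a minimal sketch of what "problem-specific assertions" could look like for the assistant. This is my own illustration, not code from Hamel's writeup. The check names (no_template_leak, within_length), the 50-word limit, and the run_evals helper are all hypothetical.

```python
import re
from typing import Callable, Dict, List

# An assertion takes one model output and returns True if it passes.
Assertion = Callable[[str], bool]


def no_template_leak(output: str) -> bool:
    """Fail if an unrendered placeholder such as {{name}} leaks into the reply."""
    return re.search(r"\{\{.*?\}\}", output) is None


def within_length(output: str, max_words: int = 50) -> bool:
    """Fail if the reply is longer than max_words (hypothetical limit)."""
    return len(output.split()) <= max_words


# Hypothetical registry of checks; real ones would come from looking at data.
ASSERTIONS: Dict[str, Assertion] = {
    "no_template_leak": no_template_leak,
    "within_length": within_length,
}


def run_evals(outputs: List[str]) -> Dict[str, float]:
    """Return the pass rate of each assertion over a batch of outputs."""
    results: Dict[str, float] = {}
    for name, check in ASSERTIONS.items():
        passed = sum(1 for o in outputs if check(o))
        results[name] = passed / len(outputs) if outputs else 0.0
    return results


if __name__ == "__main__":
    sample = [
        "Hello Sam, your order shipped.",
        "Hello {{name}}, your order shipped.",
    ]
    print(run_evals(sample))
```

The same run_evals call could be pointed at logged production outputs, which fits the "re-use your eval infrastructure" point.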

My own takeaways:

  • Domain expert
  • Test Data Coverage: features, scenarios, edge cases
  • Try LangSmith for traces
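
Note to self: one rough way to check test data coverage is to list which feature and scenario pairs have no test case yet. This is only a sketch under my own assumptions: the "feature" and "scenario" keys, the example values, and the coverage_gaps helper are hypothetical.

```python
from itertools import product
from typing import Dict, List, Tuple


def coverage_gaps(
    test_cases: List[Dict[str, str]],
    features: List[str],
    scenarios: List[str],
) -> List[Tuple[str, str]]:
    """Return every (feature, scenario) pair that no test case covers."""
    covered = {(case["feature"], case["scenario"]) for case in test_cases}
    return [pair for pair in product(features, scenarios) if pair not in covered]


if __name__ == "__main__":
    # Hypothetical feature and scenario names, for illustration only.
    cases = [
        {"feature": "search", "scenario": "happy_path"},
        {"feature": "booking", "scenario": "edge_case"},
    ]
    gaps = coverage_gaps(cases, ["search", "booking"], ["happy_path", "edge_case"])
    print(gaps)
```

The gaps list is a to-do list for new test cases, whether written by hand or generated with an LLM.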

Review

  • Pretty good day
  • Learned a few new things and had a good conversation about refactoring the assistant codebase