Conversation
Thanks for the PR. As a general comment about benchmarks, I believe it is important for users to have a transparent understanding of what is being benchmarked and how to add new ones. The other thing is that I think we should move in smaller steps, focusing on continuity rather than a big new version. Can we solve each of these problems in separate pieces of work:
I am not sure whether we should have a separate package for all of these, or introduce new packages. I think it is possible to start without committing to a new package and to use the current pyproject.toml file for dependency management -- we just need to ensure we don't include the benchmark code in the PyPI release.
Sure, no issue. With the current dataset sourcing I can move it into the
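A rough sketch of what the pyproject.toml approach could look like, assuming a PEP 621 layout with setuptools; the group name `benchmark`, the listed dependencies, and the `benchmarks*` package path are all illustrative placeholders, not taken from the PR:

```toml
# Hypothetical: keep benchmark dependencies as an optional group,
# installed with `pip install -e ".[benchmark]"`.
[project.optional-dependencies]
benchmark = [
    "docker",   # placeholder dependency
    "pandas",   # placeholder dependency
]

# Hypothetical: exclude the benchmark code from the built wheel
# so it never ships in the PyPI release.
[tool.setuptools.packages.find]
exclude = ["benchmarks*"]
```

This keeps everything in one pyproject.toml while still drawing a clear line between what users install and what only benchmark runners need.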
What does this PR do?
Why do it this way?
This is a draft PR that covers the initial step of dataset generation & caching. I still need to work on the following steps but wanted to get some feedback on the approach before investing more time into it:

TODO:

- Metrics + reporting layer
- Benchmark runners & Docker harness
- CLI quickstart
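For the dataset generation & caching step, one minimal sketch of the caching side is to key cached datasets on a hash of their generation parameters, so reruns with the same parameters hit the cache and any parameter change regenerates. All names here (`CACHE_DIR`, `get_or_generate`, the JSON on-disk format) are hypothetical, not from this PR:

```python
import hashlib
import json
from pathlib import Path

# Hypothetical cache location; the PR may choose a different layout.
CACHE_DIR = Path(".benchmark_cache")

def cache_key(params: dict) -> str:
    """Stable short hash over the generation parameters."""
    blob = json.dumps(params, sort_keys=True).encode()
    return hashlib.sha256(blob).hexdigest()[:16]

def get_or_generate(params: dict, generate) -> list:
    """Return the cached dataset for `params`, generating it on a miss."""
    CACHE_DIR.mkdir(exist_ok=True)
    path = CACHE_DIR / f"{cache_key(params)}.json"
    if path.exists():
        return json.loads(path.read_text())
    data = generate(params)
    path.write_text(json.dumps(data))
    return data
```

Keying on a canonical JSON dump of the parameters makes the cache transparent: users can see exactly which parameter set produced each cached file.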