Some of the existing evaluation code should be extracted into the new RAG package to allow for efficient testing.