See PyETR to LLM: What Question to Ask? for more details.

Convert Questions

Convert existing cases into a format that can be easily run through an LLM engine.
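A minimal sketch of what this conversion could look like, assuming each case can be reduced to a list of plain-English premises plus an expected conclusion. The `Case` dataclass, the field names, the "What follows?" wording, and the example case are all hypothetical placeholders for illustration. They are not taken from `pyetr/cases.py`, and the real question wording depends on the "What Question to Ask?" discussion.

```python
import json
from dataclasses import dataclass


@dataclass
class Case:
    """Hypothetical stand-in for a PyETR case, already rendered as English."""
    name: str
    premises: list[str]
    conclusion: str


def case_to_record(case: Case) -> dict:
    """Turn one case into a question/answer record an eval harness could read."""
    numbered = "\n".join(f"{i}. {p}" for i, p in enumerate(case.premises, start=1))
    # Placeholder prompt wording; the actual question type is still undecided.
    question = f"Premises:\n{numbered}\nWhat follows?"
    return {"id": case.name, "question": question, "answer": case.conclusion}


def cases_to_jsonl(cases: list[Case]) -> str:
    """Serialise cases as JSON Lines, one record per line."""
    return "\n".join(json.dumps(case_to_record(c)) for c in cases)


# Made-up example case, not one of the cases in the repository.
example = Case(
    name="example_case",
    premises=["Either there is an ace or there is a king.", "There is no ace."],
    conclusion="There is a king.",
)
jsonl = cases_to_jsonl([example])
print(jsonl)
```

JSON Lines is used here only because it is a simple format that most evaluation tooling can load; any flat record format would do.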

Data

See https://github.com/dreamingspires/PyETR/blob/master/pyetr/cases.py for a list of cases.

GPT3logs.docx

Generated Questions

See https://github.com/Oxford-HAI-Lab/PyETR/blob/master/lm_eval/data_generation/random_logical/generated/generated_cases_medium.py

Running Evals

Create a harness for running questions through LLMs. See ‣ for details; the upshot is that I’m going to use LM Evaluation Harness, which should do what we want.
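As a rough sketch of how a run could be launched, assuming the v0.4-style `lm_eval` command-line interface of LM Evaluation Harness: the snippet below only builds and prints the command, it does not execute it. The task name `pyetr_cases`, the model `pretrained=gpt2`, and both paths are hypothetical placeholders, and the helper function is mine, not part of the harness.

```python
import shlex
from typing import Optional


def build_lm_eval_command(
    task: str,
    model_args: str,
    include_path: str,
    output_path: str,
    limit: Optional[int] = None,
) -> list[str]:
    """Assemble an lm_eval invocation as an argv list (nothing is executed)."""
    cmd = [
        "lm_eval",
        "--model", "hf",                 # Hugging Face model backend
        "--model_args", model_args,      # e.g. which checkpoint to load
        "--tasks", task,                 # name of the (custom) task to run
        "--include_path", include_path,  # directory holding custom task configs
        "--output_path", output_path,    # where results are written
    ]
    if limit is not None:
        # Restrict the run to the first few examples, useful for smoke tests.
        cmd += ["--limit", str(limit)]
    return cmd


# All values below are placeholders for illustration only.
cmd = build_lm_eval_command(
    task="pyetr_cases",
    model_args="pretrained=gpt2",
    include_path="path/to/custom_tasks",
    output_path="results",
    limit=10,
)
print(shlex.join(cmd))
```

Once the task definition exists, the resulting list could be passed to `subprocess.run(cmd, check=True)`. Older releases of the harness used a different entry point, so the flags should be checked against the installed version.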

High Level Desiderata

What Type of Question

See PyETR to LLM: What Question to Ask?

High Level Goals

Data Collection

Difficulty