A tool to automate testing and comparing ChatGPT responses based on variations in prompts and configurations
- Built a tool to benchmark different OpenAI API use cases by comparing the Embeddings, Completions, and Assistants APIs, each with different prompt structures.
- Created a front-end visualization table to filter and compare results.
- Wrote Python scripts to build a PKL file of embeddings from datasets of product descriptions and metadata, provided as CSV, JSON, HTML, or raw text.
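A minimal sketch of what such a script could look like: each input format is normalized to one string per record, embedded, and pickled. All function and variable names here are hypothetical, and the `embed` callable is left injectable so the OpenAI client (or any other embedding backend) can be plugged in.

```python
import csv
import io
import json
import pickle
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect visible text from an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        if data.strip():
            self.chunks.append(data.strip())


def rows_from_source(raw: str, fmt: str) -> list[str]:
    """Normalize CSV, JSON, HTML, or raw text into one string per record."""
    if fmt == "csv":
        return [" ".join(row) for row in csv.reader(io.StringIO(raw))]
    if fmt == "json":
        return [json.dumps(rec) for rec in json.loads(raw)]
    if fmt == "html":
        parser = _TextExtractor()
        parser.feed(raw)
        return [" ".join(parser.chunks)]
    # raw text: one record per non-empty line
    return [line for line in raw.splitlines() if line.strip()]


def build_pkl(raw: str, fmt: str, embed, path: str) -> None:
    """Embed each record and pickle a {text: vector} store for later lookup."""
    store = {text: embed(text) for text in rows_from_source(raw, fmt)}
    with open(path, "wb") as f:
        pickle.dump(store, f)
```

With the OpenAI Python client (v1.x), `embed` could be something like `lambda t: client.embeddings.create(model="text-embedding-3-small", input=t).data[0].embedding`; the model name is an assumption, not one stated in the original.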
- Created a result schema that maps each AI recommendation back to its prompt response time, configuration, and embedding vector store, enabling confident fine-tuning of results.