@maharajamihir
Contributor

  • add_import_after_use.md - User writes code using json.loads(), model suggests navigating to file top and adding import json
  • binary_toggle.md - User writes dark theme if block, types else {, model completes with opposite light theme values (sun.png vs moon.png)
  • comment_momentum.md - User copies Transformer forward pass, edits to pre-layernorm, starts commenting out old version; model continues the commenting pattern
  • cross_file_navigation.md - User adds parameter to function call in main.py, model suggests jumping to services.py to update function definition
  • cross_file_python.md - User writes code in main.py using functions from utils.py; model completes normalize(mat_a) based on cross-file context
  • cross_file_react.md - User pastes Button component, switches to App.jsx, types partial import; model completes import { Button } from './components/Button'
  • debug_typo_fix.md - User runs sbatch job, sees FileNotFoundError for trian_dataset.jsonl, model suggests fixing typo to train_dataset.jsonl
  • env_config_completion.md - User writes os.getenv("STRIPE_W, model completes with EBHOOK_SECRET") based on .env.example
  • fullstack_glue.md - User adds "Premium Status" column in frontend, types partial accessorKey: ", model completes with is_premium" from backend model
  • god_tier_file_context.md - User types report_date = _format in 500-line helpers.py, model completes with _format_iso_date_to_human(created_at) defined ~440 lines above
  • line_complete.md - Tests basic line completion for DataLoader methods (self.load_json(filename), item.get(key) == value], json.dump(..., indent=2))
  • long_horizon_debug.md - User adds breakpoints, runs debugger, renames variables to Shazeer notation (x_BT, x_BTD, logits_BTV), then removes breakpoints
  • overfit_single_batch.md - User comments out training loop, fetches single batch, types while Tr; model completes overfitting loop
  • propagate_field.md - User adds email parameter to init, model propagates to self.email = email, then to user_to_dict, then to instantiation
  • semantic_filter.md - User creates get_available_products function, types available_; model infers [p for p in products if p['in_stock']] from jsonl data
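To make the last case concrete, here is a minimal sketch of what the `semantic_filter.md` scenario might look like once the completion is accepted. The JSONL records, the `load_products` helper, and the `name` field are hypothetical stand-ins, since the actual test data is not shown in this description; only `get_available_products`, the `available_` prefix, and the `in_stock` filter come from the test case summary above.

```python
import json

# Hypothetical JSONL data; the real file used by the test case is not shown here.
PRODUCTS_JSONL = """\
{"name": "lamp", "in_stock": true}
{"name": "desk", "in_stock": false}
{"name": "chair", "in_stock": true}
"""


def load_products(text):
    """Parse one JSON object per non-empty line (hypothetical helper)."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def get_available_products(products):
    # The user types `available_`; the model is expected to infer the
    # `in_stock` filter from the fields present in the JSONL data.
    available_products = [p for p in products if p['in_stock']]
    return available_products


products = load_products(PRODUCTS_JSONL)
available = get_available_products(products)
print([p["name"] for p in available])
```

The point of the case is that nothing in the function name states the filter condition; the model has to pick `in_stock` out of the data it has seen.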

Contributor

Copilot AI left a comment

Pull request overview

This PR adds 15 new handcrafted test cases for evaluating code completion model capabilities across diverse scenarios including cross-file navigation, semantic understanding, debugging workflows, and pattern recognition.

Changes:

  • Added 15 markdown test case files testing various code completion scenarios
  • Each test includes bash commands, code snippets, and assertion validation
  • Tests cover Python, JavaScript/React, and configuration file scenarios

Reviewed changes

Copilot reviewed 15 out of 15 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| semantic_filter.md | Tests model ability to infer list comprehension filter based on JSONL data context |
| propagate_field.md | Tests field propagation across init, dict conversion, and instantiation |
| overfit_single_batch.md | Tests completion of overfitting loop pattern for ML debugging |
| long_horizon_debug.md | Tests variable renaming to Shazeer notation and breakpoint cleanup |
| line_complete.md | Tests basic line completion for DataLoader methods |
| god_tier_file_context.md | Tests long-range context usage (~440 lines) for function completion |
| fullstack_glue.md | Tests cross-file field name matching between backend and frontend |
| env_config_completion.md | Tests environment variable name completion from .env.example |
| debug_typo_fix.md | Tests typo identification and correction in file paths |
| cross_file_react.md | Tests React component import completion with correct path |
| cross_file_python.md | Tests Python function call completion using cross-file context |
| cross_file_navigation.md | Tests navigation and parameter addition across files |
| comment_momentum.md | Tests pattern recognition for continuing comment blocks |
| binary_toggle.md | Tests completion with opposite/toggle values (dark/light theme) |
| add_import_after_use.md | Tests adding missing import after detecting usage |

2 participants