PAL-Math

This dataset follows DeepSeek Coder's PAL-Math implementation, in which an LLM solves math problems by writing Python code. It covers seven datasets: GSM8k, MATH, GSM-Hard, SVAMP, TabMWP, ASDiv, and MAWPS.
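To make the idea concrete, here is a minimal sketch of the kind of program a model might produce for a word problem: the reasoning is expressed as Python, and running the program yields the answer. The function name `solution` and the problem text are illustrative assumptions, not the exact format the sandbox requires.

```python
def solution():
    """Olivia has $23. She buys five bagels at $3 each. How much money is left?"""
    money_initial = 23
    bagels = 5
    bagel_cost = 3
    # Total spent is the number of bagels times the price of each.
    money_spent = bagels * bagel_cost
    # The answer is whatever remains after the purchase.
    return money_initial - money_spent


answer = solution()
print(answer)
```

Because the arithmetic is carried out by the interpreter rather than by the model, the result can be checked by executing the code.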

Configuration

Field              Description
compile_timeout    Compilation timeout in seconds
run_timeout        Execution timeout in seconds

Usage

from datasets import load_dataset
import requests

config = {
    'compile_timeout': 20,
    'run_timeout': 20,
    'dataset_type': "PalMathDataset"
}

# Get dataset data in sandbox format
data = list(load_dataset("sine/FusedPALMath", "asdiv", split="test"))

# Pass the full list of samples to build the prompts
config['provided_data'] = data
prompts = requests.post('http://localhost:8080/get_prompts', json={
    'dataset': 'palmath',
    'config': config
}).json()

print('please perform model inference on these prompts:')
print('\n'.join([p['prompt'] for p in prompts[:3]]))
print('...')

# your model inference code here
completions = ['' for _ in prompts]

for completion, sample in zip(completions, data):
    # For submission, provided_data is the single sample being evaluated
    config['provided_data'] = sample
    res = requests.post('http://localhost:8080/submit', json={
        'dataset': 'palmath',
        'id': '',
        'completion': completion,
        'config': config
    })

    print(f'result: {res.json()}')
    break  # only the first sample is submitted in this example

Note: Always put the raw completion in the request. Sandbox handles code extraction itself, according to the mode in use.
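One way to follow this when submitting many samples is to build each request body in a small helper that passes the completion through untouched. The sketch below is an assumption-laden illustration: `build_submit_payload` is a hypothetical helper, the sample dictionary is a made-up placeholder, and the payload keys simply mirror the request shown in the Usage code.

```python
import copy


def build_submit_payload(completion, sample, base_config, dataset="palmath"):
    """Hypothetical helper: build the JSON body for /submit.

    The completion is passed through unchanged (no stripping, no code
    extraction), and base_config is copied so it is not mutated.
    """
    config = copy.deepcopy(base_config)
    config["provided_data"] = sample
    return {
        "dataset": dataset,
        "id": "",
        "completion": completion,
        "config": config,
    }


base_config = {
    "compile_timeout": 20,
    "run_timeout": 20,
    "dataset_type": "PalMathDataset",
}
# Placeholder sample; real samples come from load_dataset(...) and may
# carry different fields.
sample = {"question": "What is 23 - 5 * 3?"}
raw_completion = (
    "Multiply first, then subtract.\n"
    "def solution():\n"
    "    return 23 - 5 * 3\n"
)
payload = build_submit_payload(raw_completion, sample, base_config)
```

The resulting `payload` can be sent with `requests.post('http://localhost:8080/submit', json=payload)`. Copying the config per request keeps the shared dictionary free of per-sample data, which matters if requests are later issued concurrently.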