Web - template and web app for creating and running Mechanical Turk HITs as External Questions
Backend - command line tools for interacting with the MTurk API and creating HITs in MTurk
RUN ONCE:
load_data_to_db.py
populates the languages table in the database
loads data from the wikilanguages-pipeline into the vocabulary and dictionary tables respectively
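A minimal sketch of what this load step could look like (the table names come from the description above; the column names, input file names, and the use of sqlite3 are assumptions for illustration only):

import csv
import sqlite3  # assumption: the real database may be MySQL/Postgres; sqlite3 keeps the sketch self-contained

conn = sqlite3.connect("hitman.db")
cur = conn.cursor()

# populate the languages table (hypothetical two-column CSV: code, name)
with open("languages.csv", newline="", encoding="utf-8") as f:
    for code, name in csv.reader(f):
        cur.execute("INSERT INTO languages (code, name) VALUES (?, ?)", (code, name))

# load wikilanguages-pipeline output into the vocabulary table (hypothetical columns)
with open("wikilanguages_vocabulary.csv", newline="", encoding="utf-8") as f:
    for lang, word in csv.reader(f):
        cur.execute("INSERT INTO vocabulary (language, word) VALUES (?, ?)", (lang, word))

conn.commit()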
prepare_images_files.py
generates image descriptor files for all vocabulary and dictionary items
java -Djava.awt.headless=true generateimages sentences.txt segments.txt images
#java -Djava.awt.headless=true generateimages test_sentences.txt test_segments.txt ../test_images
#java -Djava.awt.headless=true generateimages pilot_sentences.txt pilot_segments.txt ../pilot_images
renders PNG files
10MB of images per 1000 words+sentences
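The rendering command above can also be driven from a small wrapper script; this is only a hypothetical convenience wrapper around the exact command shown (the generateimages class and the descriptor files are assumed to be on the classpath / in the working directory):

import subprocess

def render_images(sentences="sentences.txt", segments="segments.txt", out_dir="images"):
    # -Djava.awt.headless=true lets the renderer run on a server with no display attached
    subprocess.run(
        ["java", "-Djava.awt.headless=true", "generateimages", sentences, segments, out_dir],
        check=True,
    )

if __name__ == "__main__":
    render_images()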
obama4australia.py
generates images for Barack Obama and Australia in all languages
#java -Djava.awt.headless=true Str2ImgObama support_sentences.txt support_sequence.txt ../pilot_images/support
generate_voc_hits.py
creates HIT types in MTurk and populates the hittypes table in the database
generates batches of vocabulary HITs in the database based on the vocabulary data
populates the hits and voc_hits_data tables in the database
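A minimal sketch of the HIT type registration, assuming a boto3 MTurk client (the repository's own scripts may use an older client library; the title, reward, and timing values below are placeholders, and the hittypes column name is an assumption):

import boto3
import sqlite3

# assumption: sandbox endpoint for testing; drop endpoint_url for production
mturk = boto3.client("mturk", region_name="us-east-1",
                     endpoint_url="https://mturk-requester-sandbox.us-east-1.amazonaws.com")

resp = mturk.create_hit_type(
    Title="Translate vocabulary words",             # placeholder
    Description="Translate a short list of words",  # placeholder
    Keywords="translation, vocabulary",
    Reward="0.10",
    AssignmentDurationInSeconds=600,
    AutoApprovalDelayInSeconds=86400,
)

# record the new HIT type in the hittypes table
conn = sqlite3.connect("hitman.db")
conn.execute("INSERT INTO hittypes (mturk_hittype_id) VALUES (?)", (resp["HITTypeId"],))
conn.commit()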
add_voc_hits_to_mturk.py
generates HITs in MTurk and links them to batches of words in the database
creates a HIT in MTurk for every HIT in the database with an empty mturk_hit_id column
updates the hits table with the resulting non-empty mturk_hit_id value
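A sketch of that loop, assuming a boto3 MTurk client; the hits table and the mturk_hit_id column come from the description above, while the external URL, the mturk_hittype_id column, and the lifetime/assignment values are assumptions:

import boto3
import sqlite3

EXTERNAL_QUESTION = """<ExternalQuestion xmlns="http://mechanicalturk.amazonaws.com/AWSMechanicalTurkDataSchemas/2006-07-14/ExternalQuestion.xsd">
  <ExternalURL>https://example.com/voc_hit?hit_id={hit_id}</ExternalURL>
  <FrameHeight>600</FrameHeight>
</ExternalQuestion>"""

mturk = boto3.client("mturk")
conn = sqlite3.connect("hitman.db")

for hit_id, hit_type_id in conn.execute(
        "SELECT id, mturk_hittype_id FROM hits WHERE mturk_hit_id IS NULL").fetchall():
    resp = mturk.create_hit_with_hit_type(
        HITTypeId=hit_type_id,
        MaxAssignments=3,                 # 3 assignments per HIT, as in the estimates below
        LifetimeInSeconds=7 * 24 * 3600,
        Question=EXTERNAL_QUESTION.format(hit_id=hit_id),
    )
    conn.execute("UPDATE hits SET mturk_hit_id = ? WHERE id = ?",
                 (resp["HIT"]["HITId"], hit_id))
conn.commit()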
generate_synonyms_table.py
generates the list of synonyms to be used as controls in the Synonyms HIT
generate_non_synonyms_table.py
generates the list of non-synonyms to be used as controls in the Synonyms HIT
RUN CONTINUOUSLY:
get_assignments.py
this task retrieves all completed assignments from all HITs and loads the results into the proper hits_results tables
(works both for vocabulary and synonyms HITs)
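A sketch of the retrieval loop, assuming a boto3 MTurk client; the hits and assignments tables appear elsewhere in this file, but the exact column names here (and the lack of pagination handling) are simplifications:

import boto3
import sqlite3

mturk = boto3.client("mturk")
conn = sqlite3.connect("hitman.db")

for hit_id, mturk_hit_id in conn.execute(
        "SELECT id, mturk_hit_id FROM hits WHERE mturk_hit_id IS NOT NULL").fetchall():
    resp = mturk.list_assignments_for_hit(
        HITId=mturk_hit_id,
        AssignmentStatuses=["Submitted"],
    )
    for a in resp["Assignments"]:
        # store the raw answer XML; a real script would also parse it into the results tables
        # (INSERT OR IGNORE assumes a unique index on mturk_assignment_id, so re-runs are safe)
        conn.execute(
            "INSERT OR IGNORE INTO assignments (mturk_assignment_id, hit_id, worker_id, answer_xml) "
            "VALUES (?, ?, ?, ?)",
            (a["AssignmentId"], hit_id, a["WorkerId"], a["Answer"]),
        )
conn.commit()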
generate_syn_hits.py
takes processed results from the vocabulary HITs (from the voc_hits_results table) and populates the syn_hits and syn_hits_data tables
the task can be run multiple times without creating duplicates
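The "no duplicates on re-run" property can be expressed as an insert guarded by an existence check; this is only a sketch, and the word/translation columns are assumptions (voc_assignment_id and the table names appear in the queries further down):

import sqlite3

conn = sqlite3.connect("hitman.db")
conn.execute("""
    INSERT INTO syn_hits_data (voc_assignment_id, word, translation)
    SELECT r.assignment_id, r.word, r.translation
    FROM voc_hits_results r
    WHERE NOT EXISTS (
        SELECT 1 FROM syn_hits_data s
        WHERE s.voc_assignment_id = r.assignment_id AND s.word = r.word
    )
""")
conn.commit()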
add_syn_hits_to_mturk.py
generates HITs in MTurk and links them to batches in the database
creates a HIT in MTurk for every HIT in the database with an empty mturk_hit_id column
updates the hits table with the resulting non-empty mturk_hit_id value
get_assignments.py
-- run again to get the results of the new syn HITs
review_step2.py
reviews the results of step 2 (synonym HITs)
grades syn HIT results (based on controls)
selects all assignments pending update in MTurk
for each pending assignment:
grade the assignment
update its status in MTurk and create an extra assignment for the related HIT if the assignment's data was bad (data_quality <= 50%)
close the assignment in the database
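A sketch of that per-assignment loop, assuming a boto3 MTurk client; the 50% cut-off comes from the grading rules at the end of this file, while the status column and the join layout are assumptions:

import boto3
import sqlite3

mturk = boto3.client("mturk")
conn = sqlite3.connect("hitman.db")

pending = conn.execute(
    "SELECT a.id, a.mturk_assignment_id, h.mturk_hit_id, a.data_quality "
    "FROM assignments a JOIN hits h ON a.hit_id = h.id "
    "WHERE a.status = 'pending'").fetchall()

for assignment_id, mturk_assignment_id, mturk_hit_id, quality in pending:
    if quality > 0.5:
        mturk.approve_assignment(AssignmentId=mturk_assignment_id)
        status = "approved"
    else:
        mturk.reject_assignment(
            AssignmentId=mturk_assignment_id,
            RequesterFeedback="Control questions were answered incorrectly.",
        )
        # ask for one replacement assignment on the same HIT
        mturk.create_additional_assignments_for_hit(
            HITId=mturk_hit_id, NumberOfAdditionalAssignments=1)
        status = "rejected"
    conn.execute("UPDATE assignments SET status = ? WHERE id = ?", (status, assignment_id))
conn.commit()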
review_step1.py
selects all syn HITs that have 3 completed assignments with data_status > 50% each
CREATE OR REPLACE VIEW syn_hits_completed AS
-- select all syn HITs that have 3 assignments with good quality of data. Those HITs are considered completed
-- and their data can be pushed back to Step 1 (Vocabulary HITs/Assignments)
for each hit
get the 3 assignments above (syn_hits_completed x assignments, joined by hit_id)
weight the results for each word translation and multiply by the performance of the worker
-- weight+result per hit per assignment
SYN_HITS_RESULTS_WEIGHTED_PAIRS
select voc_assignment_id, pair_id, are_synonyms, max(weight)
from
(
select voc_assignment_id, pair_id, are_synonyms, avg(worker_performance) as weight from (
select shd.voc_assignment_id, a.hit_id, a.id as assignment_id, shr.quality, shr.pair_id, shwp.quality as worker_performance, shr.are_synonyms
from assignments a, syn_hits_completed shc, syn_hits_results shr, syn_hits_data shd, syn_hits_workers_performance shwp
where
shd.hit_id=a.hit_id
and
shd.id=shr.pair_id
and
a.hit_id=shc.id
and
shr.assignment_id=a.id
and
shr.is_control=0 -- remove controls (don't need them)
and
a.worker_id=shwp.id
and
a.data_status>0.5 -- cut off nonperforming assignments
) as r
group by voc_assignment_id, pair_id, are_synonyms
) as rr
group by voc_assignment_id, pair_id, are_synonyms
sort result by weight?
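The same weighting expressed in Python, as a readability aid for the query above: for each (voc_assignment_id, pair_id), each candidate answer is weighted by the average performance of the workers who gave it, and the answer with the highest weight wins (this reading of the outer max() is an interpretation, not something the script is guaranteed to do):

from collections import defaultdict

def weight_pairs(rows):
    # rows: iterable of (voc_assignment_id, pair_id, are_synonyms, worker_performance)
    by_answer = defaultdict(list)
    for voc_assignment_id, pair_id, are_synonyms, perf in rows:
        by_answer[(voc_assignment_id, pair_id, are_synonyms)].append(perf)

    best = {}  # (voc_assignment_id, pair_id) -> (are_synonyms, weight)
    for (voc_assignment_id, pair_id, are_synonyms), perfs in by_answer.items():
        weight = sum(perfs) / len(perfs)   # avg(worker_performance), as in the inner query
        key = (voc_assignment_id, pair_id)
        if key not in best or weight > best[key][1]:
            best[key] = (are_synonyms, weight)
    return best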
-- backtracking database from step2 to step1
select * from hits
select * from assignments
select * from vocabulary
select * from dictionary
select * from voc_hits_data
where hit_id in (2,3)
select * from voc_hits_results
where assignment_id=3
select * from voc_hits_results
where assignment_id=3
and is_control=0
select * from assignments
where id in (11,12)
select * from syn_hits
-- hit_id 41/42
select * from syn_hits_data
where id in (11,12)
-- hit_id=42
select * from syn_hits_data
where is_control=0
select * from syn_hits_results
-- voc assignments that are completed
select * from assignments
where id in (8,4)
-- hit_id 36/29 for assg_id 8/4
select * from voc_hits_data
where id in (36,29)
select * from syn_hits sh, assignments a, workers w, syn_hits_data shd
where sh.id=a.hit_id
and a.worker_id=w.id
and
shd.is_control=0
and
shd.hit_id=sh.id
-- trace from voc_assignment_id to
select * from assignments a, hits h
where
a.hit_id=h.id
and
a.id=6
select * from voc_hits_results
select * from voc_hits_data
select * from assignments
select * from syn_hits_data
*review_step2.py
reviews the results of step 2
HOW IT WORKS:
for all assignments:
+review whether the assignment passed/failed based on the controls
+mark assignments as passed/failed
+go to MTurk and accept/reject assignments as needed
+add an extra assignment to the corresponding HIT for each rejected assignment
for all HITs:
+select all HITs that have 3 accepted assignments
-calculate passed/failed translations for each HIT, for each word_id
select * from synonymshits h, synonymshitassignments a, synonymshitresults r, synonymshitsdata d
where a.mturk_hit_id=h.mturk_hit_id
and
r.assignment_id=a.id
and
r.pair_id=d.id
-- select words and 1/0 for correct/incorrect (need to be grouped)
select word_id, d.assignment_id,
case WHEN are_synonyms='no' THEN 0
else 1
end as correct
from synonymshits h, synonymshitassignments a, synonymshitresults r, synonymshitsdata d
where a.mturk_hit_id=h.mturk_hit_id
and
r.assignment_id=a.id
and
r.pair_id=d.id
and
d.is_control=0
-push fail/pass status to step 1
do something with all the QA data and pass it to step 1
*review_step1.py
reviews the results of step 1
HOW IT WORKS:
for all assignments:
-review whether the assignment passed/failed based on the results pushed from step 2
-mark assignments as passed/failed
-go to MTurk and accept/reject assignments as needed
-add an extra assignment to the corresponding HIT for each rejected assignment
for all HITs:
-select all HITs that have 3 accepted assignments
-calculate passed/failed translations for each HIT, for each word_id
-build dictionary
-tada!
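A sketch of the "calculate passed/failed translations" step as a simple majority vote over the 3 accepted assignments per HIT; the 2-out-of-3 threshold is an assumption, not something this file specifies:

def passed_translations(per_word_votes):
    # per_word_votes: dict mapping word_id -> list of 3 booleans (passed/failed per assignment)
    result = {}
    for word_id, votes in per_word_votes.items():
        result[word_id] = sum(votes) >= 2   # at least 2 of the 3 assignments passed this word
    return result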
---
Task estimates
100 languages
10,000 words per language
1,250 HITs per language
125,000 step 1 HITs
375,000 step 1 assignments
375,000 * 2 synonym pairs
750,000 pairs
100,000 step 2 HITs? (4 times fewer than step 1)
300,000 step 2 assignments
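The estimates as arithmetic, to make the assumed factors explicit (the 8-words-per-HIT and 3-assignments-per-HIT factors are inferred from the numbers above, not stated anywhere else):

langs = 100
words_per_lang = 10_000
words_per_hit = 8                                        # inferred: 10,000 words / 1,250 HITs per language
assignments_per_hit = 3

step1_hits = langs * words_per_lang // words_per_hit     # 125,000
step1_assignments = step1_hits * assignments_per_hit     # 375,000
synonym_pairs = step1_assignments * 2                    # 750,000
step2_hits = 100_000                                     # from the estimate above
step2_assignments = step2_hits * assignments_per_hit     # 300,000

print(step1_hits, step1_assignments, synonym_pairs, step2_assignments)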
----
Rules for grading tasks:
The first 10 assignments performed by a worker are auto-approved
After that, assignments are rated on the following scale:
100%-70%: approved, results are good
70%-50%: approved, results are questionable
50%-0%: rejected, results are thrown away
give a ? for questionable controls and check for a majority vote
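A direct encoding of this scale; the function name and the per-worker counter are hypothetical:

def grade(control_score, assignments_done_by_worker):
    # control_score: fraction of control questions answered correctly (0.0 - 1.0)
    if assignments_done_by_worker < 10:
        return "approved"                 # first 10 assignments are auto-approved
    if control_score >= 0.7:
        return "approved"                 # results are good
    if control_score > 0.5:
        return "approved-questionable"    # kept, but checked against the majority vote
    return "rejected"                     # results are thrown away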