algorithm - How to calculate difficulty metric?

- January 15, 2014

Note: I have changed the original question!

I have several texts, each consisting of several words. The words are categorized into difficulty categories 1 to 6, 1 being the easiest and 6 the hardest (or from most common to least common). However, not all words can be put into these categories, because there are countless words in the English language.

Each category has twice as many words as the category before.

Level 1: 100 words in total (100 new)
Level 2: 200 words in total (100 new)
Level 3: 400 words in total (200 new)
Level 4: 800 words in total (400 new)
Level 5: 1600 words in total (800 new)
Level 6: 3200 words in total (1600 new)
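
The doubling rule behind this list can be checked with a few lines of Python. This is only a sketch: the function name `level_sizes` and its parameters are made up for illustration, and it assumes the levels are numbered 1 to 6 in the order listed.

```python
def level_sizes(levels=6, base=100):
    """Return (level, total_words, new_words) for each level.

    Assumes level 1 holds `base` words and every later level doubles
    the running total, as in the list above.
    """
    rows = []
    previous_total = 0
    for level in range(1, levels + 1):
        total = base * 2 ** (level - 1)
        # New words are whatever the previous level did not already cover.
        rows.append((level, total, total - previous_total))
        previous_total = total
    return rows


for level, total, new in level_sizes():
    print(f"Level {level}: {total} words in total ({new} new)")
```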

When I use the term level 6 below, I mean a word introduced in level 6: it is part of the 1600 new words and can't be found in the 1600 words of level 5.

How do I rate the difficulty of an individual text? Compare these texts:

An easy one

would consist of basic vocabulary:

I drive a car.

Let's say these are 4 level 1 words.

A medium one

This old man is cretinous.

This is a basic sentence that comes with 1 difficult word.

A hard one

would have advanced vocabulary in there too:

I steer a gas guzzler.

So how much more difficult is the second or third text than the first one? Let's compare text 1 and text 3. "I" and "a" are still level 1 words, "gas" might be level 2, "steer" is level 4, and "guzzler" is not in the list. "Cretinous" is level 6. How do I calculate the difficulty of these texts, now that I've classified the vocabulary?
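
To make the classification step concrete, here is a minimal Python sketch that maps each word of a text to its level and marks unlisted words as `None`. The `WORD_LEVELS` table and the `classify` function are hypothetical: the levels for "gas", "steer" and "cretinous" follow the guesses above, and every other entry is assumed to be level 1.

```python
import re

# Hypothetical lookup table. Only "gas", "steer" and "cretinous" come from
# the question; the remaining words are assumed to be level 1.
WORD_LEVELS = {
    "i": 1, "a": 1, "drive": 1, "car": 1,
    "this": 1, "old": 1, "man": 1, "is": 1,
    "gas": 2, "steer": 4, "cretinous": 6,
}


def classify(text):
    """Return (word, level) pairs; level is None for words not in the list."""
    words = re.findall(r"[a-z']+", text.lower())
    return [(word, WORD_LEVELS.get(word)) for word in words]


print(classify("I steer a gas guzzler."))
# [('i', 1), ('steer', 4), ('a', 1), ('gas', 2), ('guzzler', None)]
```

Keeping unlisted words as `None` rather than dropping them leaves the decision about how to treat them to whatever metric is applied afterwards.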

I hope it is more clear what I want now.

The problem you are trying to solve is how to quantify qualitative data.

The search term "quantifying qualitative data" may help you.

There is no general all-purpose algorithm for this. The best way will depend upon what you want to use the metric for, and what the ratings of each individual task mean for the project as a whole in terms of practical impact on the factors you are interested in.

For example, if the hardest tasks are typically unsolvable and a project involves a single type 6 task, then the project may become unsolvable, and your metric would need to reflect this.
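
One way to reflect that kind of bottleneck is to score by the single hardest item rather than by an average. The sketch below is one possible choice, not a recommendation: the name `bottleneck_score` is made up, and it simply ignores unrated items (`None`), which is itself an assumption you may not want.

```python
def bottleneck_score(ratings):
    """Score by the hardest rated item.

    ratings: list of ints 1..6, with None for unrated items.
    Returns None when nothing is rated. Unrated items are ignored here,
    which is an assumption, not a rule from the question.
    """
    rated = [r for r in ratings if r is not None]
    return max(rated) if rated else None


print(bottleneck_score([1, 1, 1, 1]))        # 1
print(bottleneck_score([1, 1, 1, 1, 6]))     # 6
print(bottleneck_score([1, 4, 1, 2, None]))  # 4
```

With this metric a single level 6 item dominates the score no matter how many easy items surround it.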

You also need to find a way to address the missing data (unrated tasks). It's likely that a single numeric metric is not going to capture all the information you want about these projects.

Once you have understood what the metric will be used for, and how the task ratings relate to each other (linearly increasing difficulty vs. categorical distinctions), there are plenty of simple metrics that may codify the analysis.

For example, you may rate projects by risk based on a combination of the number of unknown tasks and the number of tasks with difficulty above a threshold. Alternatively, you may rate projects by duration based on a weighted sum of task difficulty, using a default or estimated difficulty for unknown tasks.
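
Both ideas can be sketched in a few lines of Python. Everything specific here is an assumption chosen for illustration: the function names, the threshold of 4, the equal weighting of unknown and hard items, the doubling weights per level, and treating unknown items as level 6.

```python
def risk_score(ratings, threshold=4, unknown_weight=1.0, hard_weight=1.0):
    """Combine the count of unrated items with the count above a threshold.

    ratings: list of ints 1..6, with None for unrated items.
    """
    unknown = sum(1 for r in ratings if r is None)
    hard = sum(1 for r in ratings if r is not None and r > threshold)
    return unknown_weight * unknown + hard_weight * hard


def weighted_sum_score(ratings, weights=None, default_rating=6):
    """Sum a per-level weight over all items.

    weights: dict mapping level -> weight. The default doubles per level
    (1, 2, 4, 8, 16, 32), a hypothetical choice.
    default_rating: level assumed for unrated items.
    """
    if weights is None:
        weights = {level: 2 ** (level - 1) for level in range(1, 7)}
    return sum(weights[default_rating if r is None else r] for r in ratings)


easy = [1, 1, 1, 1]        # four level 1 items
hard = [1, 4, 1, 2, None]  # levels 1, 4, 1, 2 and one unrated item

print(risk_score(easy), risk_score(hard))                  # 0.0 1.0
print(weighted_sum_score(easy), weighted_sum_score(hard))  # 4 44
```

Whether the weights should grow linearly, double per level, or be treated as separate categories depends on how the ratings relate to each other, so the numbers above only show the mechanics.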
