2. Intervali zaupanja za test Mann-Whitney [Confidence intervals for the Mann-Whitney test]
- Author
Manevski, Damjan and Pohar Perme, Maja
- Subjects
effect size, confidence interval, probabilistic index, ROC curve, area under the ROC curve, small sample size, Mann-Whitney
- Abstract
The Mann-Whitney test is a commonly used non-parametric alternative to the two-sample t-test. Despite its frequent use, it is only rarely accompanied by a confidence interval for an effect size. When intervals are reported, the reporting is not uniform: the effect size is usually measured as the difference of medians or as the shift between the two distributions. Neither measure is directly linked to the Mann-Whitney test statistic, so the interpretation of the test result may differ substantially from the interpretation of the confidence interval. In this thesis, we focus on the probability that the random variable X is lower than the random variable Y. The estimator of this measure is in a one-to-one relationship with the Mann-Whitney test statistic, and the measure itself is often referred to as the degree of overlap or the probabilistic index. It also equals the area under the ROC curve (AUC). Several methods have been proposed for constructing a confidence interval for this measure; we review the most promising ones and explain the ideas behind them. We study the properties of the different variance estimators and the small-sample problems that arise when constructing the intervals, and we identify scenarios in which the existing approaches yield inadequate coverage probabilities. We conclude that the DeLong variance estimator is reliable regardless of the distribution of the samples, but that the interval should be constructed on the logit scale, which avoids bounds above 1 or below 0 and the poor coverage probability that follows from them. A correction is still needed for the case in which all values of one sample are smaller than all values of the other. We propose a correction that gives a substantially better confidence interval in this case.
- Published
- 2018
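As an illustration of the measure discussed in the abstract of entry 2, the following Python sketch estimates the probabilistic index and a logit-scale confidence interval from a DeLong-type variance estimate. It is a minimal sketch under stated assumptions, not the authors' implementation: ties are counted as one half (the abstract does not say how ties are handled), the interval uses a normal approximation with the delta method, and all function names are hypothetical. When the two samples are completely separated, the estimate is 0 or 1 and the logit is undefined; the sketch returns no interval in that case rather than guessing at the correction the authors propose.

```python
import math
from statistics import NormalDist, mean, variance


def psi(x, y):
    """Kernel of the index: 1 if x < y, 0.5 on a tie (assumption), else 0."""
    if x < y:
        return 1.0
    if x == y:
        return 0.5
    return 0.0


def probabilistic_index(xs, ys):
    """Share of (x, y) pairs with x < y; equals U / (m * n) and the empirical AUC."""
    return mean(psi(x, y) for x in xs for y in ys)


def delong_variance(xs, ys):
    """DeLong-type variance estimate built from per-observation placement values."""
    v10 = [mean(psi(x, y) for y in ys) for x in xs]  # one value per x
    v01 = [mean(psi(x, y) for x in xs) for y in ys]  # one value per y
    return variance(v10) / len(xs) + variance(v01) / len(ys)


def logit_interval(xs, ys, level=0.95):
    """Return (estimate, (lower, upper)); the interval is None if the estimate is 0 or 1."""
    theta = probabilistic_index(xs, ys)
    if theta in (0.0, 1.0):
        return theta, None  # complete separation: logit undefined, needs a correction
    se = math.sqrt(delong_variance(xs, ys))
    z = NormalDist().inv_cdf(0.5 + level / 2)
    centre = math.log(theta / (1 - theta))
    half_width = z * se / (theta * (1 - theta))  # delta method on the logit scale

    def expit(t):
        return 1 / (1 + math.exp(-t))

    return theta, (expit(centre - half_width), expit(centre + half_width))
```

Calling `logit_interval([1, 2, 3, 4], [2, 3, 5, 6])` on these made-up samples gives an estimate of 0.75 with bounds that stay strictly inside (0, 1), which is the property the abstract attributes to the logit scale.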
3. Povezanost gibalne učinkovitosti otrok s spolom, starostjo in okoljskimi dejavniki [Relationship of children's physical fitness with gender, age, and environmental factors]
- Author
Matejek, Črtomir and Starc, Gregor
- Subjects
udc:796.012.1-053.3:316.34, effect size, environment, physical fitness development, factorial analysis
- Abstract
The main aim of the research was to determine how children's physical fitness is related to age, gender, and certain environmental factors at the onset of puberty. The research was carried out on a representative sample of 897 children (47.9% female and 52.1% male) aged eleven and fourteen. Twelve tests were used to assess their physical fitness. Based on the time they devote to physical activity, the children were divided into four categories: inactive, occasionally active, active, and highly active. Based on paternal and maternal education, they were classified into three categories: low, average, and high. Based on their school grades in mathematics, they were divided into three groups: less successful, successful, and very successful. Based on their place of residence, they were divided into three groups: urban, suburban, and rural. A component model of factor analysis was used to define the basic coordinate system in the space of physical fitness. A factorial analysis of variance was used to examine the relationship of physical fitness with environmental factors, age, and gender. The results show that most of the differences in physical fitness are explained by age and gender. We can therefore conclude that the key factors in physical fitness development are the growth, development, and maturation rate of the individual, which are predominantly hereditary. Place of residence, physical activity, school grades, and parental education have considerably less influence and serve only as an additional impulse that further stimulates or inhibits the physical fitness of children.
- Published
- 2017
4. Mjerenje veličine učinka pri proizvodnji funkcionalnog mlijeka [Measuring effect size in the production of functional milk]
- Author
Pažek, Karmen, Rozman, Črtomir, Turk, Jernej, Majkovič, Darja, Hari, Sebjan, Kolenko, Matej, Pamič, Sašo, and Prišenk, Jernej
- Subjects
functional food, milk, effect size, Cohen's d index, udc:631.1:637.1
- Abstract
The paper presents a possible application of effect size and Cohen's d index to the introduction of a new milk product on the market. A field survey and an online survey were used to establish the potential interest of final consumers in a new functional food product of a dairy company in Slovenia: milk with added phytosterols. Cohen's d was calculated in two ways, manually and with a Cohen's d calculator. The application focuses on the two main survey questions: 1) Would you buy milk with added phytosterols, which is scientifically proven to lower the concentration of cholesterol in the blood? 2) Would you pay a higher price for it? The sample comprises 419 surveys, of which 150 were conducted in the field (control group) and 269 online (experimental group). The manual calculation and the calculator yielded a "small" effect (d = 0.35 and d = 0.34, respectively) and a "zero or near zero" effect (d = 0.15 with both methods) for the decision to buy the new milk product.
- Published
- 2017
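The abstract of entry 4 reports Cohen's d values obtained manually and with a calculator. A minimal Python sketch of one common form of the index, the difference in group means divided by the pooled standard deviation, is given below. Whether the authors used exactly this pooled form is an assumption, the function name is hypothetical, and the data in the usage note are made up for illustration.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d with a pooled standard deviation (assumed variant)."""
    n_a, n_b = len(group_a), len(group_b)
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)
```

For the made-up samples `[2, 4, 6]` and `[1, 3, 5]`, both groups have sample variance 4 and the means differ by 1, so `cohens_d` returns 0.5.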
5. Analysis of the results of agricultural holdings in Slovenia processed according to FADN methodology
- Author
Trpin Švikart, Darija and Pažek, Karmen
- Subjects
effect size, sampler, significance, economics, grant recipient, FADN
- Abstract
Using the database provided by the Farm Accountancy Data Network (FADN), we studied the correlations among selected variables. The strength of the correlations between the individual variables (number of full-time labour units, agricultural land in use, total revenue, balance of current subsidies and taxes, farm family income, and gross value added) was examined with Spearman's coefficient. We confirmed the expectation that higher values of the selected variables lead to a higher value of the dependent variable (gross value added). The significant correlation with the lowest Spearman's coefficient was found for the number of full-time labour units and the agricultural land in use, which is somewhat surprising, since both variables would be expected to play an important role in influencing the dependent variable. A closer examination of these two variables could therefore be a topic for further research, also in view of the European Union (EU) criticism that Slovenia has too many full-time labour units relative to the efficiency of its agricultural holdings. To confirm that increasing values of the independent variables affect the dependent variable (gross value added), we carried out robust regression and the least squares method. The latter proved less suitable because of the outliers contained in the FADN database. At the same time, the high value of the adjusted coefficient of determination (0.97 and above) confirmed that robust regression with the MM estimator was the right choice of method. In this thesis, we worked with two groups: sample agricultural holdings ("samplers") as the control group, because they are subject to more stringent logical checks, and holdings that receive grants as the test group. To calculate Cohen's d index for these two groups, we first computed some descriptive statistics. In more than half of the cases, the distributions of the variables under consideration in the test and control groups overlap almost completely, which partially confirmed the second hypothesis.
- Published
- 2016
6. Optional modelling and Effect size measurement for decision support in agrifood production
- Author
Kolenko, Matej and Pažek, Karmen
- Subjects
real options, effect size, Black-Scholes model, CBA analysis, onion processing, binomial model, Cohen d index
- Abstract
The primary purpose of the study was to show the application of real options methods in agriculture and, as an important complement, the measurement of effect size with Cohen's d index. The results of the traditional CBA with the parameter NPVt showed that the production of dried onion is the most expedient (NPVt = €518,066). The Black-Scholes model and the binomial model were then used to assess the real options of the onion-processing investment projects. The option values (OV) under both the Black-Scholes (BS) model and the binomial model are positive for all three types of onion processing (frozen onion, onion rings, dried onion). Processing onion into dried onion proved to be the most expedient investment: the option value amounted to €39,798 under the Black-Scholes method and €77,561 under the binomial model. Real options have significant value, because they extend the traditional methods of investment analysis with methods that also take the stochastic perspective into account. Cohen's d index was used to measure the effect size between the online and field surveys for the questions considered. The findings show that 76% of the respondents' results are identical between the two groups, which indicates a small effect; 20% show a medium effect and 4% a large effect.
- Published
- 2015
