Attribution  /  Informative set of parameters

An experiment was performed to describe the a priori classes in terms of the parameters from the a priori dictionary of parameters, in order to determine the informative set of parameters.

Two random samples of 100 sentences each were drawn from the two a priori classes. A sample of 100 sentences is sufficient for a pilot sample, which is used to determine the order of magnitude of the variance. In mathematical statistics it is recommended to make no fewer than 30 measurements (Doerffel, 1990). In the research, the actual required sample size is determined using formulas that take into account the standard deviation and the size of the general population of texts (formula 2).
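Formula 2 is not reproduced in this section. As a rough illustration only, the sketch below uses a common textbook formula for the sample size needed to estimate a mean in a finite population, n = t²s²N / (Δ²N + t²s²). The choice of formula, the function name `required_sample_size`, the margin of error, and the population size are all assumptions made for the example, not values from the research.

```python
import math

def required_sample_size(std_dev, margin, population, t=1.96):
    """Sample size for estimating a mean in a finite population.

    Textbook formula, assumed here for illustration; it is not
    necessarily the formula 2 referred to in the text.
    """
    numerator = (t ** 2) * (std_dev ** 2) * population
    denominator = (margin ** 2) * population + (t ** 2) * (std_dev ** 2)
    return math.ceil(numerator / denominator)

# Hypothetical inputs: a standard deviation estimated from a pilot sample,
# a made-up margin of error, and a made-up number of sentences in the texts.
n_needed = required_sample_size(std_dev=1.41, margin=0.2, population=2000)
print(n_needed)
```

The pilot sample supplies the standard deviation; a larger spread or a tighter margin of error raises the required sample size, while a small population lowers it.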

The results of the experiment were presented in two object-feature data matrices with dimensions n × N = 100 × 51, where n is the number of objects (sentences) and N is the number of parameters.
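A minimal sketch of such a matrix, with one row per sentence and one column per parameter, is shown below. The data are random placeholders rather than real measurements, the helper name `column_summary` is hypothetical, and the use of the sample (n - 1) standard deviation is an assumption.

```python
import random
import statistics

N_OBJECTS = 100     # sentences in one sample
N_PARAMETERS = 51   # parameters X1..X51

random.seed(0)
# Placeholder counts standing in for real measurements (hypothetical data).
matrix = [[random.randint(0, 5) for _ in range(N_PARAMETERS)]
          for _ in range(N_OBJECTS)]

def column_summary(data, j):
    """Return (mean, sample standard deviation, n) for parameter column j."""
    column = [row[j] for row in data]
    return statistics.mean(column), statistics.stdev(column), len(column)

# One (mean, std. dev., n) triple per parameter, for a single a priori class.
summaries = [column_summary(matrix, j) for j in range(N_PARAMETERS)]
```

Running the same summary over the matrix of each class gives the per-class mean, standard deviation, and sample size needed to compare the two classes parameter by parameter.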

The set of informative parameters was formed using M.M. Bongard's scheme, which calls for a two-step reduction of the parametric space (Bongard, 1967). In the first stage, the a priori set of parameters is divided into two subsets: parameters that are relevant for distinguishing the a priori classes and parameters that are not. The relevance of a parameter for distinguishing the two a priori classes is determined using Student's t-criterion. In the research, the population parameters are not known but only estimated from the samples, and the critical value of Student's t-criterion at α = 0.05 is approximately 1.96, since for samples of this size the t-distribution is close to the normal distribution.
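The section does not reproduce the formula for the criterion. One standard form of the two-sample statistic is given below; computed from the tabulated summary values, it reproduces the tabulated t values to within rounding. The symbols are notation assumed here: $\bar{x}_i$, $s_i$, and $n_i$ denote the sample mean, sample standard deviation, and sample size for class $\Omega_i$.

```latex
t = \frac{\lvert \bar{x}_1 - \bar{x}_2 \rvert}
         {\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}
```

A parameter is treated as relevant when $t$ exceeds the critical value. With $n_1 = n_2$, the pooled-variance form of the statistic gives the same number.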

 

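Calculation of Student's t-criterion.

| Parameter | Ω1 (Corneille): mean | Ω1: std. dev. | Ω1: n | Ω2 (Quinault): mean | Ω2: std. dev. | Ω2: n | t    |
|-----------|----------------------|---------------|-------|---------------------|---------------|-------|------|
| 1         | 2                    | 3             | 4     | 5                   | 6             | 7     | 8    |
| X1        | 3.28                 | 4.92          | 100   | 2.84                | 4.19          | 100   | 0.68 |
| X2        | 1.80                 | 0.89          | 100   | 2.17                | 1.41          | 100   | 2.22 |
| X3        | 0.44                 | 0.56          | 100   | 0.38                | 0.65          | 100   | 0.70 |
| X4        | 0.53                 | 0.94          | 100   | 1.22                | 1.43          | 100   | 4.04 |
| X5        | 0.04                 | 0.20          | 100   | 0.02                | 0.14          | 100   | 0.83 |
| X6        | 0.52                 | 0.75          | 100   | 0.44                | 0.78          | 100   | 0.74 |
| X7        | 0.50                 | 0.70          | 100   | 0.44                | 0.78          | 100   | 0.57 |
| X8        | 0.02                 | 0.14          | 100   | 0.00                | 0.00          | 100   | 1.42 |
| X9        | 0.00                 | 0.00          | 100   | 0.00                | 0.00          | 100   | -    |
| X10       | 0.00                 | 0.00          | 100   | 0.00                | 0.00          | 100   | -    |
| X11       | 1.41                 | 0.91          | 100   | 1.65                | 1.10          | 100   | 1.68 |
| X12       | 0.04                 | 0.20          | 100   | 0.02                | 0.14          | 100   | 0.83 |
| X13       | 0.06                 | 0.28          | 100   | 0.08                | 0.34          | 100   | 0.46 |
| X14       | 0.05                 | 0.22          | 100   | 0.07                | 0.29          | 100   | 0.56 |
| X15       | 9.61                 | 5.28          | 100   | 10.50               | 7.29          | 100   | 0.99 |
| X16       | 3.58                 | 2.94          | 100   | 3.46                | 3.14          | 100   | 0.28 |
| X17       | 2.38                 | 1.95          | 100   | 2.31                | 2.00          | 100   | 0.25 |
| X18       | 1.36                 | 1.44          | 100   | 1.38                | 1.50          | 100   | 0.10 |
| X19       | 2.28                 | 1.89          | 100   | 2.42                | 1.99          | 100   | 0.51 |
| X20       | 0.04                 | 0.20          | 100   | 0.06                | 0.28          | 100   | 0.59 |
| X21       | 1.76                 | 1.01          | 100   | 2.14                | 1.52          | 100   | 2.09 |
| X22       | 0.77                 | 1.06          | 100   | 0.97                | 1.15          | 100   | 1.28 |
| X23       | 0.93                 | 0.98          | 100   | 1.04                | 1.13          | 100   | 0.74 |
| X24       | 1.49                 | 1.45          | 100   | 1.17                | 1.30          | 100   | 1.64 |
| X25       | 0.96                 | 1.16          | 100   | 0.79                | 1.01          | 100   | 1.10 |
| X26       | 0.42                 | 0.79          | 100   | 0.32                | 0.63          | 100   | 0.98 |
| X27       | 0.52                 | 0.69          | 100   | 0.47                | 0.67          | 100   | 0.52 |
| X28       | 0.15                 | 0.41          | 100   | 0.28                | 0.65          | 100   | 1.69 |
| X29       | 1.27                 | 1.06          | 100   | 1.13                | 1.30          | 100   | 0.83 |
| X30       | 1.40                 | 1.25          | 100   | 1.11                | 1.35          | 100   | 1.58 |
| X31       | 1.47                 | 0.98          | 100   | 1.91                | 1.46          | 100   | 2.50 |
| X32       | 1.08                 | 0.95          | 100   | 1.39                | 1.05          | 100   | 2.19 |
| X33       | 0.37                 | 0.85          | 100   | 0.34                | 0.87          | 100   | 0.25 |
| X34       | 1.12                 | 2.85          | 100   | 1.08                | 2.91          | 100   | 0.10 |
| X35       | 0.14                 | 0.51          | 100   | 0.17                | 0.59          | 100   | 0.38 |
| X36       | 0.15                 | 0.56          | 100   | 0.07                | 0.41          | 100   | 1.16 |
| X37       | 0.10                 | 0.39          | 100   | 0.07                | 0.26          | 100   | 0.64 |
| X38       | 0.43                 | 1.85          | 100   | 0.36                | 1.43          | 100   | 0.30 |
| X39       | 0.02                 | 0.14          | 100   | 0.00                | 0.00          | 100   | 1.42 |
| X40       | 0.02                 | 0.20          | 100   | 0.00                | 0.00          | 100   | 1.00 |
| X41       | 1.37                 | 1.45          | 100   | 1.31                | 1.43          | 100   | 0.29 |
| X42       | 0.00                 | 0.00          | 100   | 0.02                | 0.14          | 100   | 1.42 |
| X43       | 0.42                 | 0.78          | 100   | 0.45                | 0.73          | 100   | 0.28 |
| X44       | 0.20                 | 0.55          | 100   | 0.23                | 0.55          | 100   | 0.39 |
| X45       | 0.34                 | 0.57          | 100   | 0.35                | 0.58          | 100   | 0.12 |
| X46       | 1.00                 | 2.14          | 100   | 0.75                | 1.47          | 100   | 0.96 |
| X47       | 1.08                 | 1.06          | 100   | 0.93                | 1.08          | 100   | 0.99 |
| X48       | 1.21                 | 1.34          | 100   | 1.26                | 1.33          | 100   | 0.26 |
| X49       | 3.44                 | 3.77          | 100   | 3.64                | 4.42          | 100   | 0.34 |
| X50       | 2.83                 | 2.98          | 100   | 2.93                | 3.20          | 100   | 0.23 |
| X51       | 0.61                 | 1.21          | 100   | 0.71                | 1.57          | 100   | 0.50 |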
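As a cross-check, the last column can be recomputed from the other columns. The sketch below takes the first two values listed for each class to be the sample mean and standard deviation, an interpretation that reproduces the published t values to within rounding. The function name `t_statistic` and the handling of the undefined case are assumptions made for the example.

```python
import math

def t_statistic(mean1, sd1, n1, mean2, sd2, n2):
    """Unpooled two-sample t statistic from summary values.

    Returns None when both standard deviations are zero, because the
    statistic is then undefined (division by zero).
    """
    standard_error = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    if standard_error == 0:
        return None
    return abs(mean1 - mean2) / standard_error

# Rows copied from the table: (mean, std. dev., n) for each class.
rows = {
    "X1": (3.28, 4.92, 100, 2.84, 4.19, 100),
    "X2": (1.80, 0.89, 100, 2.17, 1.41, 100),
    "X9": (0.00, 0.00, 100, 0.00, 0.00, 100),
    "X31": (1.47, 0.98, 100, 1.91, 1.46, 100),
}
for name, values in rows.items():
    t = t_statistic(*values)
    print(name, "-" if t is None else f"{t:.2f}")
```

Parameters such as X9 and X10, which never occur in either sample, have no defined t value and so cannot be used to distinguish the classes.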

 

 

For five parameters the value of the t-criterion exceeded the critical level, which made it possible to identify the following parameters as informative: X2 (the number of simple sentences), X4 (the number of complex sentences), X21 (the number of conjugated verb forms), X31 (the number of subjects), and X32 (the number of pronoun subjects).
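The first-stage selection amounts to a threshold filter over the t values. The sketch below applies it to a subset of the tabulated values; the function name `select_informative` and the use of `None` for an undefined value are assumptions made for the example.

```python
CRITICAL_VALUE = 1.96  # threshold quoted in the text for alpha = 0.05

# A subset of t values copied from the table; None marks an undefined value.
t_values = {
    "X1": 0.68, "X2": 2.22, "X4": 4.04, "X9": None, "X11": 1.68,
    "X21": 2.09, "X28": 1.69, "X31": 2.50, "X32": 2.19,
}

def select_informative(t_by_parameter, critical=CRITICAL_VALUE):
    """Keep parameters whose t value is defined and exceeds the critical value."""
    return [name for name, t in t_by_parameter.items()
            if t is not None and t > critical]

informative = select_informative(t_values)
print(informative)
```

Values just below the threshold, such as 1.68 for X11 and 1.69 for X28, are excluded, so the outcome depends on the chosen significance level.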

The second stage of M.M. Bongard's scheme calls for a further reduction of the parametric space to a subset of informative parameters.

The calculations showed that the second-stage criterion was less than 1 for all five parameters, from which one can conclude that the second stage of selection did not further reduce the number of informative parameters: the informative set consists of the five parameters obtained in the first stage.

©2009-2011 All copyright, trade marks, design rights, patents and other intellectual property rights (registered and unregistered) in and on corneille-moliere.com and all content located on the site shall remain vested in site authors. You may not copy, reproduce, republish, post, broadcast, transmit, make available to the public, or otherwise use.