
My name is Valentin Voroshilov, and I am an expert in the field of education. I am looking for support in developing a new type of tool and procedure for measuring the learning outcomes of students taking STEM courses.


The Description of the Project


“What can’t be measured can’t be managed.” This sentiment is widely accepted in the financial and business community. In education, however, any discussion of “accountability” and “measurability” draws conflicting views and opinions at every level. What should and should not be measured, when, and by whom? What conclusions and decisions should be derived from the results? The most controversial question, however, is “how?” Currently there is no tool for measuring the learning outcomes of students that is broadly accepted by teachers, school and district officials, parents, and policymakers. A tool for measuring learning outcomes that is open to the public, consistent with the content learned, and uniform in its procedure and results does not exist. The situation is as if all 50 states had different currencies with no exchange rates, so that money from one state could not be used in another.


As a consequence, any data on the quality of teaching is adopted by one “camp” and disputed by another (the latest example appeared in The Boston Globe on March 19, 2015, in the article “Boston’s charter schools show striking gains”).


As a physics teacher, I am working on a tool for measuring the learning outcomes of students taking physics, but exactly the same approach can be used in any STEM subject.


Many instruments to measure learning outcomes in physics have been proposed and used, including but not limited to the Force Concept Inventory (Hestenes et al.), the Force and Motion Conceptual Evaluation (Thornton, Sokoloff), the Brief Electricity and Magnetism Assessment (Ding et al.), the SAT physics subject test, the AP physics exam, and the MCAT (physics questions). However, every current tool for measuring learning outcomes suffers from at least one of the deficiencies listed below:


1. The list of concepts depends on the preference of the developer and is usually not open to the user (someone using an inventory can infer the concepts by analyzing its questions, but that is only an inference).

2. The set of questions depends on the preference of a developer.

3. The set of questions is limited and cannot be used to extract gradable information on student’s skills and knowledge.

4. The fundamental principles used for development of a measuring tool are not clear or not open to public, teachers, instructors, professionals.

5. The set of questions becomes available for public examination only after it has been used; hence it reflects only the developers’ view of what students should know and be able to do. There is no general consensus among users, instructors, and developers on what should be measured and how to interpret the results.

6. The set of questions changes after each examination, which makes it impossible to compare the results of students taking different tests (the public has to rely on the developers’ assertion that the exams are equivalent, with no evidence of that).


I propose that a tool and a procedure used to measure the learning outcomes of students taking a physics course (or, for that matter, any STEM course) must satisfy the following conditions:


(a) Every aspect of the development and use of the tool has to be open to the public and available for examination by anyone.

(b) The use of the tool must lead to gradable information on a student’s skills and knowledge.

(c) That gradable information must not depend on any features of the teaching or learning process and must allow the learning outcomes of any and all students using the tool to be compared on a uniform basis.

(d) Any institution adopting the tool becomes an active member of the community, which can propose alterations to accommodate changes in the understanding of what students should know and be able to do.


It might seem impossible to develop a tool that satisfies all the conditions above. In fact, however, there is one example of such a tool in use: without a solid theoretical foundation, but as a practical instrument, a similar tool for measuring the learning outcomes of prospective students was used at Perm Polytechnic Institute (Russia, ~1994-2003).


We can reasonably assume that, after the initial adoption of the tool by participating collaborators (a school, a college), similar approaches will attract the attention of officials at different levels (school principals; district, city, and state officials; staff of philanthropic foundations). The demonstration of such a tool and its applicability could have a profound positive effect on the whole system of education, leading to better comparability, measurability, and accountability without the negative effects that have led to resistance to the use of standardized tests. Such a tool (or rather a set of tools developed for all STEM subjects and for all levels of learning) would be adopted voluntarily by schools or districts, forming a community of active co-developers. Members of the community who adopted the tool would issue regular updates, keeping it in agreement with the current understanding of what students should know and be able to do.


From the most general point of view, my project falls into the category of research on measurability in education, in particular in physics. The new tool is to be created using a new theoretical approach. The methodology for developing the tool is based on a “driving exam” approach, which replaces a verbal description of the skills and knowledge a student has to display after taking a course (a.k.a. “standards”) with a collection of exercises and actions that a student has to demonstrate the ability to perform (assuming the full level of learning).


Currently I concentrate my work on only one subject, physics, and on only one area, kinematics, just to prove that the concept works. The project is based on work I did in Russia before I had to move to the U.S. To my knowledge, there is currently no theoretical work similar to mine.


I base the project on the scientific method as developed in physics and apply it to education.


1. I start from the fact that in physics every component of a student’s knowledge and every skill can be probed by asking the student to solve a specific problem.


2. For a given level of learning physics there is a set of problems which can be used to probe a student’s knowledge and skills.


3. For a given level of learning physics, the set of problems which can be used to probe a student’s knowledge and skills has a finite number of items.


Hence, it is plausible to assume that, to test the level of acquired physics knowledge and skills, a collection of problems can be developed which, when solved in full, would demonstrate the top level of achievement in learning physics (an A level of skills and knowledge). A subset of reasonable size, composed of problems selected from this collection, can be used as a test or exam offered to a student to collect gradable information on the student’s skills and knowledge.


In the early nineties at Perm Polytechnic Institute (PPI), a set of about 3000 problems was developed and used to compose an exam of 20 problems of varying difficulty, offered to high school graduates applying to become PPI undergraduates. The philosophy behind the approach was simple: it is just impossible to memorize the solutions to all the problems in the book (the book was openly sold in the PPI bookstore), and even if someone could memorize all the solutions, PPI would definitely want that person as a student (FYI, the competition to become a PPI undergraduate was fierce).


I laid out the path to the theoretical foundation of the described approach in my work of 1996-1998. This work was interrupted by social circumstances in Russia.


Below is a summary of the publications (my translation from Russian) related to the project.


There are three fundamental principles which represent the basis for the proposed project.


The first principle states that all physics problems can be classified by (a) the set of physics quantities which have to be used to solve a problem; (b) the set of mathematical expressions (which by their nature are either definitions or laws of physics, or are derived from them) which have to be used to solve a problem; and (c) the sequence of steps which have to be carried out in order to solve a problem (an example of a general “algorithm” for solving any physics problem is at http://teachology.xyz/general_algorithm.htm).


The second principle states the importance of new terminology for describing and classifying different problems (in physics, many similar problems use very different wording). All problems which can be solved by applying exactly the same set of quantities (a), the same set of expressions (b), and the same sequence (c) are congruent to each other. Problems which use the same sets (a) and (b) but differ in sequence (c) are analogous problems. Two problems whose sets of physics quantities (a) differ by one quantity are similar. The four problems below illustrate the terminology: problem A is analogous to problem B and congruent to problem C, and problem D is similar to problems A, B, and C.


Problem A. For takeoff, a plane needs to reach a speed of 100 m/s. The engines provide an acceleration of 8.33 m/s². Find the time it takes for the plane to reach this speed.


Problem B. For takeoff, a plane needs to reach a speed of 100 m/s. It takes 12 s for the plane to reach this speed. Find the acceleration of the plane during its run on the ground.


Problem C. A car starts from rest and reaches a speed of 18 m/s, moving with a constant acceleration of 6 m/s². Find the time it takes for the car to reach this speed.


Problem D. For takeoff, a plane needs to reach a speed of 100 m/s. The engines provide an acceleration of 8.33 m/s². Find the distance the plane travels before reaching this speed.
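
The terminology above can be illustrated with a minimal sketch in Python. It is only one possible encoding, not part of the project itself: the class `Problem`, the function `relation`, the string labels for quantities, expressions, and steps, and the "unrelated" fallback are all hypothetical. The sketch also assumes that "differ by one quantity" covers both one quantity being swapped for another and one quantity being added, and that problem D is solved with the derived expression v^2 = v0^2 + 2*a*d.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Problem:
    name: str
    quantities: frozenset   # set (a): physics quantities used
    expressions: frozenset  # set (b): definitions/laws used
    steps: tuple            # sequence (c): ordered solution steps


def relation(p, q):
    """Classify a pair of problems using the three definitions in the text."""
    same_a = p.quantities == q.quantities
    same_b = p.expressions == q.expressions
    if same_a and same_b:
        return "congruent" if p.steps == q.steps else "analogous"
    # Assumption: "differ by one quantity" means at most one quantity
    # is present on each side that is missing on the other side.
    only_p = p.quantities - q.quantities
    only_q = q.quantities - p.quantities
    if not same_a and len(only_p) <= 1 and len(only_q) <= 1:
        return "similar"
    return "unrelated"  # hypothetical label, not a term from the text


KIN = "v = v0 + a*t"
A = Problem("A", frozenset({"v0", "v", "a", "t"}), frozenset({KIN}),
            ("write " + KIN, "solve for t"))
B = Problem("B", frozenset({"v0", "v", "a", "t"}), frozenset({KIN}),
            ("write " + KIN, "solve for a"))
C = Problem("C", frozenset({"v0", "v", "a", "t"}), frozenset({KIN}),
            ("write " + KIN, "solve for t"))
D = Problem("D", frozenset({"v0", "v", "a", "d"}),
            frozenset({"v^2 = v0^2 + 2*a*d"}),
            ("write v^2 = v0^2 + 2*a*d", "solve for d"))

print(relation(A, B))  # analogous
print(relation(A, C))  # congruent
print(relation(D, A))  # similar
```

Under these assumptions the sketch reproduces the classification stated for problems A to D; a different encoding of the solution steps could change which pairs count as congruent rather than analogous.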


It is important to stress that all congruent problems can be stated in a general language that does not depend on the actual situation described in a problem. Problem E below is congruent to problems A and C and is stated in the most general language, not connected to any specific situation.


Problem E. An object starts moving from rest keeping constant acceleration. How much time does it need to reach a given speed?


The third principle states that every problem can be assigned a unique visual representation, which reflects the general structure of the connections within the problem. This visual representation is a graph (in the sense of graph theory): each node (vertex) of the graph represents a physical quantity without the explicit use of which the problem cannot be solved; each edge (link) of the graph represents the presence of a specific equation which includes both quantities connected by the edge. Each graph is a specific example of a knowledge map, but has to obey two strict conditions (which make each graph unique and this approach novel):


1. Every quantity represented by a vertex/node of a graph must have a numerical representation, i.e. it has to be measurable.

2. Every edge (link) between any two vertices must have an operational representation: i.e. if the value of a quantity represented by a vertex is changed, and the values of all but one of the other quantities represented by the vertices connected to it are kept constant, then the quantity represented by the remaining linked vertex must change its value.
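
The second condition can be illustrated numerically. The sketch below is only an illustration under stated assumptions: it uses the constant-acceleration relation v = v0 + a*t as the example equation, checks the condition at a single arbitrary base point (degenerate points such as t = 0 would need separate treatment), and the names `SOLVE` and `is_operational_link` are hypothetical.

```python
# One solver per quantity for the example equation v = v0 + a*t.
# Each solver recomputes its quantity from the other three.
SOLVE = {
    "v":  lambda q: q["v0"] + q["a"] * q["t"],
    "v0": lambda q: q["v"] - q["a"] * q["t"],
    "a":  lambda q: (q["v"] - q["v0"]) / q["t"],
    "t":  lambda q: (q["v"] - q["v0"]) / q["a"],
}


def is_operational_link(changed, free, base, delta=1.0, tol=1e-12):
    """Change `changed`, keep every quantity except `free` constant,
    and report whether `free` is forced to take a new value."""
    state = dict(base)
    state[changed] += delta
    new_free = SOLVE[free](state)
    return abs(new_free - base[free]) > tol


# Arbitrary base point that satisfies the equation: 2 + 6*3 = 20.
base = {"v0": 2.0, "v": 20.0, "a": 6.0, "t": 3.0}

links = {
    (x, y): is_operational_link(x, y, base)
    for x in SOLVE for y in SOLVE if x != y
}
print(all(links.values()))  # True
```

At this base point every ordered pair of quantities passes the check, which is consistent with drawing an edge between each pair of quantities that share the equation.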


Graphs which satisfy the above two conditions represent a novel technique for representing specific scientific content (I call it “a map of operationally connected categories”, or MOCC). Knowledge maps usually represent the thought process of an expert in a field solving a particular problem (analyzing a particular situation). MOCCs represent objective connections between physical quantities, which are imposed by the laws of nature.


The second condition eliminates possible indirect connections/links (otherwise the structure of a graph would not be fixed by the structure of the problem, but would depend on the preferences of the person drawing the graph). However, even in this case there is room for discussion: should the links/edges represent only definitions and fundamental laws, or also expressions derived from them? This question should be answered during the trial use of the developed tool.


All problems whose graphs include exactly the same set of quantities (a) are called like problems (they compose a set of like problems). A problem which is stated in general language and which is like all the problems in a set of like problems is the root problem of the set (any specific problem in the set can be seen as a variation of the root problem). The project is based on the proposition that a set of root problems can be used to describe the desired level of learning outcomes of students.


One of the consequences of the graphical representation of a problem is the ability to assign to it an objective numerical indicator of its difficulty, D = N_V + N_E, equal to the sum of the number of vertices in the graph, N_V, and the number of unique equations represented by the edges of the graph, N_E. This indicator can be used to rank problems by difficulty when composing a specific test for a student. It does not depend on the perception of the person composing the test and so provides uniformity in composing tests.


For example: every specific problem congruent to problem E corresponds to the graph on the right with difficulty D = 5 (four quantities and one equation which relates them).


Each link of the graph on the right represents the same connection/equation: v = v_0 + at (the graph was drawn using CmapTools).
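
The difficulty indicator for problem E can be computed with a short sketch. The data structure is an assumption made for illustration: the graph is stored as a list of vertex labels plus a dictionary mapping each edge (a pair of vertices) to the label of its equation, every pair of the four quantities is assumed to be linked, and the equation is assumed to be v = v0 + a*t. The function name `difficulty` is hypothetical.

```python
from itertools import combinations


def difficulty(vertices, edges):
    """D = N_V + N_E: number of vertices plus the number of
    unique equations that appear on the edges."""
    n_v = len(set(vertices))
    n_e = len(set(edges.values()))
    return n_v + n_e


# Problem E: four quantities, every pair linked by the same equation.
quantities = ["v0", "v", "a", "t"]
equation = "v = v0 + a*t"
edges_E = {frozenset(pair): equation for pair in combinations(quantities, 2)}

D_E = difficulty(quantities, edges_E)
print(D_E)  # 5
```

Counting unique equation labels rather than edges is what keeps D at 5 here: six edges carry one and the same equation.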





The first stage of the project is to conduct a survey of major physics textbooks, to rewrite specific problems using a general language, to classify all the problems by their types using their MOCCs, and in the end to compile the collection of unique root problems written in terms of general representation (with examples of specific problems congruent to each other and to the selected unique general problem). The problems will be ranked by the objective difficulty indicator D.


In parallel with the development of this collection of unique problems written in general language, a computer program will be developed for generating exams in accordance with given specifications, such as the length of the exam (the number of problems), the number of problems for each rubric, sub-rubric, topic, and subtopic (e.g. “kinematics of free fall”, “kinematics of relative motion”, etc.), and the number of problems of each difficulty level (within each subtopic). Several versions of the program will be developed: for desktop computers (Windows and Mac OS), for web access using a desktop browser (Internet Explorer, Safari, Mozilla Firefox, Chrome), and for web access using mobile devices (iOS, Android, Windows Mobile).
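
The core of such an exam generator might look like the sketch below. It is a hedged illustration, not a design for the planned program: the function `generate_exam`, the dictionary keys `id`, `topic`, and `difficulty`, and the small problem pool are hypothetical placeholders, and the specification is assumed to be a mapping from (topic, difficulty) to the number of problems requested.

```python
import random
from collections import defaultdict


def generate_exam(pool, spec, seed=None):
    """Draw problems at random so that each (topic, difficulty)
    pair receives the number of problems requested in `spec`."""
    rng = random.Random(seed)
    by_key = defaultdict(list)
    for problem in pool:
        by_key[(problem["topic"], problem["difficulty"])].append(problem)

    exam = []
    for key, count in spec.items():
        candidates = by_key[key]
        if len(candidates) < count:
            raise ValueError(f"not enough problems for {key}")
        exam.extend(rng.sample(candidates, count))
    return exam


# Placeholder pool: three problems per topic and difficulty level.
pool = [
    {"id": f"{topic}-{d}-{i}", "topic": topic, "difficulty": d}
    for topic in ("free fall", "relative motion")
    for d in (5, 7)
    for i in range(3)
]

spec = {("free fall", 5): 2, ("free fall", 7): 1, ("relative motion", 5): 1}
exam = generate_exam(pool, spec, seed=1)
print(len(exam))  # 4
```

Passing a seed makes a generated exam reproducible, which would let anyone regenerate and inspect the same exam from the open problem collection.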


The next stage of the project involves using the developed tools for examining learning outcomes of students taking physics at different institutions. Exam grades based on the use of the developed tool will be compared with the exam grades based on a standard approach.


As the measure of the success of the project I will use the degree of the correlation between the grades obtained using the developed tool and the grades obtained using standard written exams.
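
One common way to quantify such a correlation is the Pearson coefficient; the text does not say which measure will be used, so this choice is an assumption. The sketch below computes it from scratch. The function name `pearson_r` is hypothetical, and the two grade lists are made-up placeholder numbers, not results.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length (at least 2)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / sqrt(sxx * syy)


# Placeholder grades for the same six students (illustration only).
tool_grades = [55, 62, 70, 78, 85, 93]
written_grades = [58, 60, 73, 75, 88, 90]

r = pearson_r(tool_grades, written_grades)
print(round(r, 3))
```

A value of r close to 1 would indicate that the two grading approaches rank students in nearly the same way.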


I expect this stage to take up to two years of work and to require up to $300 K to support the development of the computer program (a firm has to be hired for this job) and my work on developing and testing the new tool and procedure (assuming that will be the only job I will be doing). Broadening the scope of the project (including math and chemistry, and extending to all levels of learning) would require at least $100 K per year per subject over an extra 2 to 4 years.




Thornton R.K., Sokoloff D.R. (1998) Assessing student learning of Newton's laws: The Force and Motion Conceptual Evaluation and Evaluation of Active Learning Laboratory and Lecture Curricula. American Journal of Physics 66: 338-352.


Halloun I.A., Hestenes D. (1985) Common sense concepts about motion. American Journal of Physics 53: 1056-1065.


Hestenes D., Wells M., Swackhamer G. (1992) Force Concept Inventory. The Physics Teacher 30: 141-158.


Hestenes D., 1998. Am. J. Phys. 66:465


Ding L., Chabay R., Sherwood B., & Beichner R. (2006). Evaluating an electricity and magnetism assessment tool: Brief Electricity and Magnetism Assessment (BEMA). Phys. Rev. ST Physics Ed. Research 2

Voroshilov V., “Universal Algorithm for Solving School Problems in Physics” // in the book “Problems in Applied Mathematics and Mechanics”. - Perm, Russia, 1998. - p. 57.

Voroshilov V., “Application of Operationally-Interconnected Categories for Diagnosing the Level of Students' Understanding of Physics” // in the book “Artificial Intelligence in Education”, part 1. - Kazan, Russia, 1996. - p. 56.

Voroshilov V., “Quantitative Measures of the Learning Difficulty of Physics Problems” // in the book “Problems of Education, Scientific and Technical Development and Economy of Ural Region”. - Berezniki, Russia, 1996. - p. 85.
