Data Dictionary

The official ScienceML Data Dictionary.

How to use ScienceML

ScienceML analyzes any historical time-series that has a constant cadence.  The data must be finite and discrete, which is always the case for recorded data.  ScienceML produces a set of scientific measurements that are intuitively familiar (energy, power, resistance, temperature, etc.) and that can be seen to drive price dynamics.  Often the price dynamics are far from equilibrium.

To invoke ScienceML, two files are needed.  The first file to upload is the customer time-series formatted as a CSV.  The second file to upload is the configuration file.  The configuration file provides information on how to process the customer’s time-series, and its upload also triggers Precision Alpha processing.  The two files should be uploaded separately, time-series first and configuration second, to a Precision Alpha-owned AWS S3 sub-bucket that is specific to the customer and created during customer subscription.  Precision Alpha leverages the AWS security infrastructure.  Security policy ensures that no other customer will have access to your processing area.
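The upload order described above can be sketched as follows.  This is a minimal illustration, not Precision Alpha's client: the bucket name, key prefix, and file paths are hypothetical placeholders, and the function accepts any object that exposes an upload_file(filename, bucket, key) method, which is the calling convention of the boto3 S3 client.

```python
from pathlib import PurePosixPath


def upload_run(s3_client, bucket, prefix, timeseries_path, config_path):
    """Upload the time-series CSV first, then the configuration file.

    The configuration upload triggers processing, so it must come last.
    `bucket` and `prefix` are hypothetical placeholders for the
    customer-specific sub-bucket created at subscription time.
    """
    uploaded = []
    # Order matters: time-series first, configuration second.
    for local_path in (timeseries_path, config_path):
        key = str(PurePosixPath(prefix) / PurePosixPath(local_path).name)
        s3_client.upload_file(local_path, bucket, key)
        uploaded.append(key)
    return uploaded
```

With boto3 installed and credentials configured, the client argument would typically be boto3.client("s3"); the actual bucket and prefix are the ones issued with the subscription.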

Additional time-series with the same time-stamps can be added to the input columns of the time-series file.  Precision Alpha will process all of them and return them in the output file.   Up to 25 time-series are permitted in each run.

The format for ScienceML input and output is laid out in the Table below.  The timestamps can have any cadence, provided it is constant.  The format for the timestamps is also under customer control; however, any timestamp convention that uses characters forbidden in filenames will cause an error.
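The input constraints stated so far (at most 25 time-series, constant cadence, no filename-forbidden characters in timestamps) can be checked before upload.  The sketch below is an illustration under assumptions, not the ScienceML validator: it assumes a header row, timestamps in the first column, and one column per time-series, and it guesses at the forbidden character set, which the text does not list.

```python
import csv
import io

MAX_SERIES = 25  # limit stated in the text
# Assumption: the exact forbidden set is not given in the text; these are
# characters commonly rejected in filenames.
FORBIDDEN = set('\\/:*?"<>|')


def check_input(csv_text, parse_timestamp=float):
    """Return a list of problems found in a time-series CSV.

    Assumes (hypothetically) a header row, timestamps in the first column,
    and one column per time-series. `parse_timestamp` converts a timestamp
    string to a number so that successive steps can be compared.
    """
    rows = [row for row in csv.reader(io.StringIO(csv_text)) if row]
    header, body = rows[0], rows[1:]
    problems = []
    if len(header) - 1 > MAX_SERIES:
        problems.append("more than 25 time-series")
    stamps = [row[0] for row in body]
    if any(FORBIDDEN & set(stamp) for stamp in stamps):
        problems.append("timestamp contains a character forbidden in filenames")
    times = [parse_timestamp(stamp) for stamp in stamps]
    steps = {round(b - a, 9) for a, b in zip(times, times[1:])}
    if len(steps) > 1:
        problems.append("cadence is not constant")
    return problems
```

An empty list means none of the three checks failed.  For non-numeric timestamps, pass a parse_timestamp function that matches the customer's own convention.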

 

A sample configuration file is reproduced below.  It identifies the output sub-bucket specific to the subscription account.  The column headers are identifiers for each time-series.  They do not need to be meaningful (e.g., ColA, ColB), but the number of column headers must match the number of time-series for processing to be correct.  In the configuration file below, there are two time-series to process: NetValue and NumTx.  Finally, the reservoir temperature for the environment, T_R, can be specified.  The default value for T_R is the temperature of statistical equilibrium in our units (e/4).

Data Definitions

The data dictionary definitions for ScienceML output are found below.  All values are floats reported to four decimal places.

Data Name

Definition

More Information

P+

Computed, machine-learned probability that the value of the next measurement will go up.

The probability is a function of the system energy and "process extent" (constraint multiplier).

P-

Computed, machine-learned probability that the value of the next measurement will not go up.

The probability is a function of the system energy and "process extent" (constraint multiplier).
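Because P+ and P- describe complementary outcomes (the next value either goes up or it does not), the two values should sum to 1 up to rounding.  The check below is a minimal sketch; the function name and the tolerance are assumptions, with the tolerance chosen to allow for each value being rounded to four decimal places.

```python
def is_complementary(p_up, p_not_up, tol=2e-4):
    """True if both values are probabilities and sum to 1 within `tol`.

    `tol` is an assumed allowance for each value having been rounded
    to four decimal places independently.
    """
    in_range = 0.0 <= p_up <= 1.0 and 0.0 <= p_not_up <= 1.0
    return in_range and abs(p_up + p_not_up - 1.0) <= tol
```

A row that fails this check most likely indicates a parsing or column-mapping problem on the reader's side.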

Emotion or Energy

Computed, machine-learned system Energy, measured relative to the equilibrium energy (zero offset).  Positive: Bull.  Negative: Bear.

Scientists often design experiments so that they have constant energy or temperature; these systems are said to be in equilibrium.  Equilibrium implies that the probability that an asset price will go up is equal to the probability that it won't go up (an "unbiased coin"). 

 

Markets, however, are driven by the emotions and actions of market participants.  Markets are almost never in equilibrium and do not have constant energy.  The energy is a measure of how far the data is from statistical equilibrium.
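The sign convention for Energy can be applied directly to the output values.  The helper below is a small sketch; the function name, the "Equilibrium" label, and the optional dead band tol are hypothetical and not part of ScienceML.

```python
def regime(energy, tol=0.0):
    """Label an Energy reading: positive is Bull, negative is Bear.

    `tol` is a hypothetical dead band around zero that is treated as
    equilibrium; with the default of 0.0 only an exact zero qualifies.
    """
    if energy > tol:
        return "Bull"
    if energy < -tol:
        return "Bear"
    return "Equilibrium"
```

For example, regime(0.0123) returns "Bull" and regime(-0.2) returns "Bear".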

Power

Power is the rate of energy flow per time step.  Market power combines Emotion and Resistance.  At equilibrium, Power is equal to Emotion squared divided by Resistance (that is, V^2/R, where V is Emotion and R is Resistance).

Energy is used to do work, or "to move the needle".  Power is the energy flow per time step and is generally not constant.  This measurement calculates the power available to perform price movement work.
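The equilibrium relation for Power can be written as a one-line function.  This sketch covers only the equilibrium case stated above; it is not how ScienceML computes the reported Power away from equilibrium, and the function name is hypothetical.

```python
def equilibrium_power(emotion, resistance):
    """Power at equilibrium: Emotion squared divided by Resistance (V^2 / R)."""
    if resistance == 0:
        raise ValueError("resistance must be non-zero")
    return emotion ** 2 / resistance
```

Comparing this value with the reported Power gives a rough indication of how far a reading is from the equilibrium relation.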

Resistance

Market resistance to changing price.

Wherever there is energy available "to move the needle", there is also resistance to moving the needle.  The more resistance, the harder it is "to move the needle".  This measurement calculates the resistive force.  The resistance is not constant in systems that are not in equilibrium.

Noise

Computed, machine-learned market (Nyquist) noise that dissipates system Energy so that it cannot be used for price movement.

Power can also be wasted (dissipation) so that it is unavailable to do the work that we want, namely, "to move the needle".  As the noise increases, the amount of wasted power increases.

Temperature

The entropic temperature of the system. 

 

In non-equilibrium dynamics, the free energy and temperature are coupled to produce a heat engine.  By observing the behavior of free energy and temperature, price entry and exit points can be identified.  See Free Energy for more detail.

The temperature is the reciprocal of the derivative of the entropy with respect to energy; this is the general definition of entropic temperature.

 

We recommend plotting free energy and temperature together on a dual-axis plot.

Free Energy

Helmholtz free energy.

 

Can be used together with Temperature to identify favorable environments for price movement, as well as entry and exit points.  See More Information.

Free energy tells us when the total energy (emotion) is available to do useful work (Helmholtz: F = E - TS, where E is energy, T is temperature, and S is entropy).  Minimum free energy is a more convenient form of maximum entropy in non-equilibrium problems.  When free energy decreases, it does work in the direction of the dominant emotion (bull or bear).  In non-equilibrium, the free energy and temperature can play off each other to generate a heat engine that drives price movements.

 

Local maxima in free energy indicate when energy is available for price movement work (an entry point).  As the free energy decreases from the local maximum, price movement in the direction of the dominant emotion can be observed over an extended time.  After the price movement, a local minimum will develop, equivalent to maximum entropy.  The stable minima signify an exit point for the dominant emotion trade.
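The local maxima and minima described above can be located in a Free Energy series with a simple neighbor comparison.  This is a minimal sketch under assumptions: it uses strict comparison with the immediate neighbors only, it does not test whether a minimum is stable, and noisy data would likely need smoothing first.  The function name is hypothetical.

```python
def local_extrema(values):
    """Return (maxima, minima) as lists of indices of strict local extrema.

    Endpoints are skipped because they have only one neighbor.
    """
    maxima, minima = [], []
    for i in range(1, len(values) - 1):
        prev, cur, nxt = values[i - 1], values[i], values[i + 1]
        if cur > prev and cur > nxt:
            maxima.append(i)  # candidate entry point
        elif cur < prev and cur < nxt:
            minima.append(i)  # candidate exit point
    return maxima, minima
```

The returned indices refer to rows of the output file, so they can be mapped back to timestamps and compared against the price series.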

ThermP+

Computed, machine-learned probability that the value of the measurement will go up in the next time step when in a thermal bath of temperature T_R.

 

Where thermal probabilities dominate over P+ and P- (above), thermal probabilities drive price movement.

 

The thermal probabilities depend on the difference between the system temperature (above) and the reservoir temperature (T_R).  Decision Machine uses a default value for T_R at statistical equilibrium (T_R = e/4), but T_R can (and should) be set by the customer in the configuration file.  Risk analysis and forecasting both require that the customer measure T_R specifically for their data.

A dissipative system is a thermodynamically open system which is operating out of, and often far from, statistical or thermal equilibrium in an environment with which it exchanges energy and information.

 

The flow of energy into and out of a dissipative system generates thermal probabilities that are calculated from first principles of modern thermodynamics.

 

The output generated by Decision Machine (DM) for a dissipative system permits a new measurement: the temperature of the reservoir (T_R).  In a dissipative system, when the Temperature and Free Energy (above) are constant, T_R can be directly measured.  This is easier to see when the values are plotted.
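Stretches where Temperature and Free Energy are both constant can also be located programmatically.  The sketch below only finds candidate windows; the text does not state how T_R is read off such a window, so that step is left to the reader.  The function name, the tolerance tol, and the minimum window length min_len are all hypothetical.

```python
def constant_windows(temperature, free_energy, tol=1e-4, min_len=3):
    """Return inclusive (start, end) index pairs where both series stay
    within `tol` of their value at the window start for at least
    `min_len` consecutive points.
    """
    windows = []
    n = min(len(temperature), len(free_energy))
    start = 0
    for i in range(1, n + 1):
        # A window ends at the end of the data or when either series drifts.
        broke = i == n or (
            abs(temperature[i] - temperature[start]) > tol
            or abs(free_energy[i] - free_energy[start]) > tol
        )
        if broke:
            if i - start >= min_len:
                windows.append((start, i - 1))
            start = i
    return windows
```

Each returned pair marks a range of output rows that is worth inspecting on a plot of Temperature and Free Energy.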

ThermP-

Computed, machine-learned probability that the value of the measurement will not go up in the next time step when in a thermal bath of temperature T_R.

 

Where thermal probabilities dominate over P+ and P- (above), thermal probabilities drive price movement.

 

The thermal probabilities depend on the difference between the system temperature (above) and the reservoir temperature (T_R).  Decision Machine uses a default value for T_R at statistical equilibrium (T_R = e/4), but T_R can (and should) be set by the customer in the configuration file.  Risk analysis and forecasting both require that the customer measure T_R specifically for their data.

A dissipative system is a thermodynamically open system which is operating out of, and often far from, statistical or thermal equilibrium in an environment with which it exchanges energy and information.

 

The flow of energy into and out of a dissipative system generates thermal probabilities that are calculated from first principles of modern thermodynamics.

 

The output generated by Decision Machine (DM) for a dissipative system permits a new measurement: the temperature of the reservoir (T_R).  In a dissipative system, when the Temperature and Free Energy (above) are constant, T_R can be directly measured.  This is easier to see when the values are plotted.