Data Dictionary

The official ScienceML Data Dictionary.

How to use Measure

ScienceML does the math and science for any historical time-series with constant cadence, provided the data is finite and discrete (as recorded data always is).  ScienceML produces a set of scientific measurements that are intuitively familiar (energy, power, resistance, temperature, etc.) and that can be seen to drive price dynamics.  Often the price dynamics are seen to be far from equilibrium.

To invoke ScienceML, two files are needed.

  • Datafile. Customer time-series formatted as a CSV file.
  • Configuration file. Provides additional information on how to process the customer’s time-series.  The configuration file upload triggers Precision Alpha processing.

The two files should be uploaded separately, time-series first and configuration second, to a Precision Alpha-owned AWS S3 sub-bucket that is specific to the customer and is created during customer on-boarding.  Precision Alpha leverages AWS security infrastructure.  Security policy ensures that no other customer will have access to your processing area.

The amount of historical data to learn from is an important consideration.  More is not necessarily better.  Having at least 100 timestamps in the datafile ensures fractional fluctuations in the third decimal place.  More timestamps in the datafile, however, tend to reduce the responsiveness to changes in the market.  We recommend that the timestamps be sampled to produce a time-series of between 125 and 1250 timestamps.  The timestamps must have a customer-specified cadence (second, hour, day, week, and so on).  The format for the timestamps is also under customer control; however, any customer timestamp convention that uses characters forbidden in filenames will throw an error.
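The sampling recommendation can be sketched as follows.  This is a minimal illustration, not the Precision Alpha procedure: the function name downsample and the choice to keep every k-th row (rather than aggregate) are assumptions.  A fixed stride is used so that the sampled series keeps a constant cadence, and the walk starts from the most recent row so the latest observation is always retained.

```python
import math


def downsample(rows, max_rows=1250):
    """Keep every k-th row so that at most `max_rows` rows remain.

    A fixed stride preserves a constant cadence.  Sampling walks backwards
    from the end so the most recent observation is always kept.
    """
    stride = max(1, math.ceil(len(rows) / max_rows))
    return rows[::-1][::stride][::-1]


# Example: 5000 observations reduced with a stride of 4.
series = list(range(5000))
sampled = downsample(series)
print(len(sampled), sampled[-1])
```

A series that is already within the limit is returned unchanged.  Whether to sample or to aggregate (for example, daily closes into weekly closes) is a choice that depends on the customer's data.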

Additional time-series with the same timestamps can be added as further input columns of the time-series file.  Precision Alpha will process all of them and return them in the output file.  Up to 25 time-series are permitted in each run.
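Before uploading, the limits described above can be checked locally.  The sketch below is illustrative only and assumes a headerless CSV whose first column is the timestamp and whose remaining columns are the time-series values.  The function name check_datafile and the set of forbidden characters are assumptions; this document does not list the exact characters the service rejects.

```python
import csv
import io

MIN_TIMESTAMPS = 100             # minimum given in the text
RECOMMENDED_RANGE = (125, 1250)  # recommended length given in the text
MAX_SERIES = 25                  # per-run limit given in the text

# ASSUMPTION: a typical set of characters rejected in filenames.  The exact
# set enforced by the service is not listed in this document.
FORBIDDEN_CHARS = set('\\/:*?"<>|')


def check_datafile(text):
    """Return a list of problems found in headerless CSV text (empty if none)."""
    rows = [row for row in csv.reader(io.StringIO(text)) if row]
    problems = []

    if len(rows) < MIN_TIMESTAMPS:
        problems.append(f"only {len(rows)} timestamps; at least {MIN_TIMESTAMPS} expected")
    elif not RECOMMENDED_RANGE[0] <= len(rows) <= RECOMMENDED_RANGE[1]:
        problems.append(f"{len(rows)} timestamps is outside the recommended range")

    widths = {len(row) for row in rows}
    if len(widths) > 1:
        problems.append("rows do not all have the same number of columns")
    elif widths:
        n_series = widths.pop() - 1  # the first column is the timestamp
        if n_series < 1:
            problems.append("no value column found")
        elif n_series > MAX_SERIES:
            problems.append(f"{n_series} time-series exceeds the limit of {MAX_SERIES}")

    for row in rows:
        if FORBIDDEN_CHARS & set(row[0]):
            problems.append(f"timestamp {row[0]!r} contains a forbidden character")
            break  # one example is enough
    return problems
```

A length outside the recommended range is reported here as a problem, although the text treats it as a recommendation rather than a hard limit.  The check does not verify that the cadence is constant, because the timestamp format is under customer control.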

Precision Alpha Measure

Precision Alpha Measure is a “multimeter” that produces exact market measurements from time-series.  Any time-series can be processed (e.g., prices, trading volumes, volatility, sales volumes, inventory, and so on).  Measure is mathematically valid in all markets, both equilibrium and non-equilibrium.

We illustrate the use of Measure with an application: Precision Alpha Exchange (PAEx).  Less than five hours after market close, Exchange scientifically processes every stock on the NYSE and NASD based on six months of closing prices to produce a data file of scientific measurements: next day probabilities that a stock will go up (down), market energy, market power, market resistance, market noise, market temperature, and market free energy (Helmholtz).  The free energy is the energy available to do “value movement work” (‘value’ depends on the specific time-series).

The data schema and I/O format are presented in the table below.  The input to the Measure web service used for Exchange consists of the first two columns (timestamp and value) in the table below, without the headers.  The data is put into a simple comma-delimited form for direct processing.  The Measure web service generates output in the format and order given in the table below, again without the headers.

A sample configuration file is reproduced below.  It identifies the output sub-bucket specific to the customer account.  The column headers are identifiers for each time-series.  The column headers do not need to be meaningful (e.g., ColA, ColB, etc.); however, the number of column headers is needed for correct processing.  In the configuration file below, there are two time-series to process: NetValue and NumTx.  Finally, the reservoir temperature for the environment, T_R, can be specified.  The default value for T_R is the temperature of statistical equilibrium in our units (e/4).

Precision Alpha Measure Exchange has a limitation.  The Measure Exchange product assumes a trading horizon of about six weeks and is based on the closing prices of stocks.  However, traders are interested in many other asset classes, horizons and time-series values (other than price).  Precision Alpha Measure OnDemand enables innovative and informative market analyses.

Precision Alpha Measure OnDemand uses the same scientific processing engine as Exchange and the same I/O, but OnDemand processes any time-series provided by the customer that satisfies minimum data quality requirements.

Data Definitions

The data dictionary definitions for ScienceML output are found below.  All values are floating-point numbers given to four decimal places.

Each entry below gives the Data Name, followed by its Definition and More Information.

P+

Definition. Computed, machine-learned probability that the value of the next measurement will go up.

More information. The probability is a function of the system energy and "process extent" (constraint multiplier).

P-

Definition. Computed, machine-learned probability that the value of the next measurement will not go up.

More information. The probability is a function of the system energy and "process extent" (constraint multiplier).

Emotion or Energy

Definition. Computed, machine-learned system Energy, measured from the equilibrium energy as a zero offset.  Positive: Bull.  Negative: Bear.

More information. Scientists often design experiments so that they have constant energy or temperature; these systems are said to be in equilibrium.  Equilibrium implies that the probability that an asset price will go up is equal to the probability that it won't go up (an "unbiased coin").

Markets, however, are driven by the emotions and actions of market participants.  Markets are not in equilibrium (almost all the time) and do not have constant energy.  The energy is a measure of how far away the data is from statistical equilibrium.

Power

Definition. Power is the rate of energy flow per time step.  Market power combines Emotion and Resistance.  At equilibrium, Power is equal to Emotion squared divided by Resistance (that is, V^2/R, with Emotion as V and Resistance as R).

More information. Energy is used to do work, or "to move the needle".  Power is the energy flow per time step and is generally not constant.  This measurement calculates the power available to perform price movement work.

Resistance

Definition. Market resistance to changing price.

More information. Wherever there is energy available "to move the needle", there is also resistance to moving the needle.  The more resistance, the harder it is "to move the needle".  This measurement calculates the resistive force.  The resistance is not constant in systems that are not in equilibrium.

Noise

Definition. Computed, machine-learned market (Nyquist) noise that dissipates system Energy so that it cannot be used for price movement.

More information. Power can also be wasted (dissipation) so that it is unavailable to do the work that we want, namely, "to move the needle".  As the noise increases, the amount of wasted power increases.

Temperature

Definition. The entropic temperature of the system.

In non-equilibrium dynamics, the free energy and temperature are coupled to produce a heat engine.  By observing the behavior of free energy and temperature, price entry and exit points can be identified.  See Free Energy for more detail.

More information. The temperature is the reciprocal of the derivative of the entropy with respect to energy, that is, 1/T = dS/dE; this is the general definition of entropic temperature.

We recommend plotting free energy and temperature together on a double-sided (dual-axis) plot.

Free Energy

Definition. Helmholtz free energy.

Can be used with temperature to identify favorable environments for price movement, and entry and exit points.  See More information.

More information. Free energy tells us when the total energy / emotion is available to do useful work (F = E - TS, Helmholtz).  Minimum free energy is a more convenient form of maximum entropy in non-equilibrium problems.  When free energy decreases, it does work in the direction of the dominant emotion (bull or bear).  In non-equilibrium, the free energy and temperature can play off each other to generate a heat engine that drives price movements.

Local maxima in free energy indicate when energy is available for price movement work (an entry point).  As the free energy decreases from the local maximum, price movement in the direction of the dominant emotion can be observed over an extended time.  After the price movement, a local minimum will develop, equivalent to maximum entropy.  The stable minima signify an exit point for the dominant emotion trade.
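The local maxima and minima described for Free Energy can be located with a simple neighbour comparison.  The sketch below is only an illustration of that idea and is not the method used by Precision Alpha; the function name local_extrema and the sample values are invented for the example.

```python
def local_extrema(values):
    """Return (maxima, minima) as lists of indices into `values`.

    A point counts as a local maximum (minimum) when it is strictly greater
    (smaller) than both immediate neighbours.  Endpoints are never reported.
    """
    maxima, minima = [], []
    for i in range(1, len(values) - 1):
        prev, cur, nxt = values[i - 1], values[i], values[i + 1]
        if cur > prev and cur > nxt:
            maxima.append(i)
        elif cur < prev and cur < nxt:
            minima.append(i)
    return maxima, minima


# Hypothetical free energy column, one value per time step.
free_energy = [0.10, 0.25, 0.40, 0.32, 0.21, 0.15, 0.18, 0.20]
maxima, minima = local_extrema(free_energy)
print(maxima, minima)  # [2] [5]
```

Because the strict comparison ignores plateaus and reacts to every small wiggle, real output rounded to four decimal places would likely need smoothing or a minimum-size threshold before the indices are read as candidate entry or exit points.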

ThermP+

Definition. Computed, machine-learned probability that the value of the measurement will go up in the next time step when in a thermal bath of temperature T_R.

Where thermal probabilities dominate P+ and P- (above), thermal probabilities drive price movement.

The thermal probabilities depend on the temperature difference between the system temperature (above) and the reservoir temperature (T_R).  Decision Machine uses a default value for T_R at statistical equilibrium (T_R = e/4), but T_R can (and should) be set by the customer in the configuration file.  Risk analysis and forecasting will both require that the customer measure T_R specifically for their data.

More information. A dissipative system is a thermodynamically open system which is operating out of, and often far from, statistical or thermal equilibrium in an environment with which it exchanges energy and information.

The flow of energy into and out of a dissipative system generates thermal probabilities that are calculated from first principles of modern thermodynamics.

The output generated by DM for a dissipative system permits a new measurement: the temperature of the reservoir (T_R).  In a dissipative system, when the Temperature and Free Energy (above) are constant, T_R can be directly measured.  This is easier to see when the values are plotted.

ThermP-

Definition. Computed, machine-learned probability that the value of the measurement will not go up in the next time step when in a thermal bath of temperature T_R.

Where thermal probabilities dominate P+ and P- (above), thermal probabilities drive price movement.

The thermal probabilities depend on the temperature difference between the system temperature (above) and the reservoir temperature (T_R).  Decision Machine uses a default value for T_R at statistical equilibrium (T_R = e/4), but T_R can (and should) be set by the customer in the configuration file.  Risk analysis and forecasting will both require that the customer measure T_R specifically for their data.

More information. A dissipative system is a thermodynamically open system which is operating out of, and often far from, statistical or thermal equilibrium in an environment with which it exchanges energy and information.

The flow of energy into and out of a dissipative system generates thermal probabilities that are calculated from first principles of modern thermodynamics.

The output generated by DM for a dissipative system permits a new measurement: the temperature of the reservoir (T_R).  In a dissipative system, when the Temperature and Free Energy (above) are constant, T_R can be directly measured.  This is easier to see when the values are plotted.
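Because the output file is delivered without headers, it can help to attach the data names from this dictionary when reading it.  The sketch below is illustrative and rests on assumptions that should be confirmed against a real output file: a single time-series per file, and a column order of timestamp, input value, then the measurements in the order they are defined above.  The function names and the sample row are invented for the example.

```python
import csv
import io

# ASSUMPTION: timestamp and input value first, then the measurements in the
# order they appear in the data dictionary.  Confirm against real output.
COLUMNS = [
    "timestamp", "value", "P+", "P-", "Energy", "Power", "Resistance",
    "Noise", "Temperature", "Free Energy", "ThermP+", "ThermP-",
]


def parse_output(text):
    """Parse headerless CSV output into a list of dicts keyed by data name."""
    records = []
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue  # skip blank lines
        if len(row) != len(COLUMNS):
            raise ValueError(f"expected {len(COLUMNS)} columns, got {len(row)}")
        record = {"timestamp": row[0]}  # timestamp format is customer-defined
        for name, cell in zip(COLUMNS[1:], row[1:]):
            record[name] = float(cell)
        records.append(record)
    return records


def probabilities_consistent(record, tol=2e-4):
    """Check that each up / not-up pair sums to 1 within rounding error."""
    return (abs(record["P+"] + record["P-"] - 1.0) <= tol
            and abs(record["ThermP+"] + record["ThermP-"] - 1.0) <= tol)


# Invented sample row, for illustration only.
sample = "2024-01-02,101.25,0.5312,0.4688,0.0312,0.0021,0.4648,0.0007,0.6612,-0.4301,0.5120,0.4880\n"
records = parse_output(sample)
print(records[0]["P+"], probabilities_consistent(records[0]))
```

The consistency check follows from the definitions: P- and ThermP- are the probabilities that the value will not go up, so each pair should sum to 1.  The tolerance allows for both values being rounded to four decimal places.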