Oracle8 Time Series Cartridge User's Guide Release 8.0.4 A57501-01 |
|
This chapter explains concepts related to the Oracle8 Time Series Cartridge, and it provides information on using the cartridge. It covers the following topics:
A time series is a set of timestamped data entries. A time series allows a natural association of data collected over intervals of time. For example, summaries of stock market trading or banking transactions are typically collected daily, and are naturally modeled with time series.
A time series can be regular or irregular, depending on how predictably data points arrive or occur:
Data generation for a time series begins with individual transactions, such as trades on a stock exchange or purchases of products. Each transaction has a timestamp and sufficient information to identify that transaction uniquely (such as a stock ticker or a product ID), as well as other pertinent information (such as the price and information to identify the party initiating the purchase or sale).
Individual transaction data is typically rolled up to produce summary data for a meaningful time period, such as a daily summary indicating the trade volume and the opening, high, low, and closing prices for each stock traded that day. This summary data is collected to produce historical data, such as a table of all daily volumes and opening, high, low, and closing prices for all stocks traded for the year 1997. For example, Figure 2-1 shows how data related to securities on a stock exchange is generated.
In Figure 2-1, each trade on the stock exchange includes several items of information, including a ticker and a price (for example, stock XYZ at 37.50). The daily summary data includes the opening, high, low, and closing prices for each ticker (for example, for XYZ: 37.75, 38.25, 37.00, 37.625). The daily data for each ticker is appended to the historical data for the ticker. The daily data is used for such purposes as quote server applications and listing in the next day's newspapers; the historical data is used by such applications as price and volume charting and technical analysis.
The data-collection model for historical data has the following characteristics:
This historical data is modeled using multiple regular time series.
The Time Series cartridge and the Oracle8 utilities, with their bulk-loading capabilities and transactional semantics, are well suited for the requirements of time series data generation.
The Time Series cartridge is especially useful in dealing with historical data. This type of data typically has relatively simple metadata but massive data storage requirements. That is, the data attributes (columns) are relatively few and easy to understand (such as ticker, volume, and opening, high, low, and closing prices); however, the number of rows is enormous (for example, data for all listed stocks for all trading days for several years). Moreover, the number of functions that users might want to perform on the data is large: for example, finding various sums, counts, maximum and minimum values, averages, number of trading days between two dates, moving average, and so on.
Figure 2-2 shows an example of historical data stored in a database.
Stock market historical databases have the following general characteristics:
This kind of financial historical data is used in examples in this manual and in the usage demo (see Section 1.6) provided with the Time Series cartridge.
A calendar maps human-meaningful time values to underlying machine representations of time. Typically, a calendar is associated with a time series.
For example, a business day calendar can define the days of the week on which stocks are traded. The holidays when trading does not occur are also in the calendar. The following are key components of a calendar:
A frequency specifies the granularity of the calendar representation. Examples of frequencies are second, minute, hour, day, month, and year.
The pattern specifies the repeating pattern of frequencies and an anchor date that identifies the date of an occurrence of the frequency. For example, if the frequency is set to day, the pattern can define which days of the week are included in the calendar. For example, a pattern of `0,1,1,1,1,1,0' over a day frequency defines a calendar over all weekdays. If an anchor date of 01-Jun-1997 (or any Sunday) is specified, then the 7-day pattern begins each Sunday; and Sunday and Saturday (0) are excluded from the calendar, while Monday through Friday (1) are included in the calendar.
Exceptions are timestamps that do not conform to the calendar pattern but that are significant for the calendar definition. There are two kinds of exceptions: off-exceptions and on-exceptions:
On-exceptions can also be used without a pattern, that is, when a zero pattern is specified for the time series. An irregular time series can be constructed by defining a frequency, a zero (0) pattern, and an on-exceptions list. In this case, the on-exceptions list defines the timestamps that constitute the irregular time series. An example of such an irregular time series is a calendar of dates for quarterly dividend payments or earnings announcements.
Each frequency has an associated integer code that is used in function calls. Table 2-1 lists the supported frequencies and their integer codes.
Frequency (Every:) | Integer Code |
---|---|
second | 1 |
minute | 2 |
hour | 3 |
day | 4 |
month | 6 |
year | 8 |
Each frequency has an associated precision. Time Series cartridge functions require that input timestamps be of the precision of the frequency associated with the calendar. (The SetPrecision function is the exception: this function takes a frequency and a timestamp and returns a timestamp that conforms to the associated calendar.)
A timestamp that is not consistent with the frequency is said to be imprecise. For example, a timestamp of 09-Sep-1997 is imprecise if it is input to a function that is dealing with a calendar whose frequency is 6 (month) or 8 (year). When you create a calendar, all timestamps used in the calendar definition (the anchor date for the pattern, the minDate and maxDate, and all off- and on-exceptions) must be precise with respect to the frequency. For example, the calendar will not be valid if you specify a frequency of month and an anchor date (patAnchor) of 02-Jan. (The calendar datatypes and their attributes are presented in Section 2.2.2.)
Table 2-2 shows the frequencies, their precision conventions, and an example timestamp of each precision.
The Time Series cartridge provides the following calendar datatypes. (Time series datatypes are described in Section 2.8.1.)
CREATE TYPE ORDSYS.ORDTCalendar AS OBJECT ( caltype INTEGER, name VARCHAR2(256), frequency INTEGER, pattern ORDSYS.ORDTPattern, minDate DATE, maxDate DATE, offExceptions ORDSYS.ORDTExceptions, onExceptions ORDSYS.ORDTExceptions);
CREATE TYPE ORDSYS.ORDTPatternBits AS VARRAY(32500) OF INTEGER; CREATE TYPE ORDSYS.ORDTPattern AS OBJECT ( patBits ORDSYS.ORDTPatternBits, patAnchor DATE);
All Time Series cartridge datatypes are installed under the ORDSYS schema, and all users must include the ORDSYS schema name when referring to these datatypes.
To define a calendar, you create a table in which to store calendar definitions and then store a row for each calendar to be defined.
The following example creates a table named stockdemo_calendars and defines a calendar named BusinessDays. The BusinessDays calendar includes Mondays through Fridays, but excludes 28-Nov-1996 and 25-Dec-1996. Explanatory notes follow the example.
CREATE TABLE stockdemo_calendars of ORDSYS.ORDTCalendar ( name CONSTRAINT calkey PRIMARY KEY); INSERT INTO stockdemo_calendars VALUES( ORDSYS.ORDTCalendar( 0 'BusinessDays', 4, ORDSYS.ORDTPattern( ORDSYS.ORDTPatternBits(0,1,1,1,1,1,0), TO_DATE('01-JAN-1995','DD-MON-YYYY')), TO_DATE('01-JAN-1990','DD-MON-YYYY'), TO_DATE('01-JAN-2001','DD-MON-YYYY'), ORDSYS.ORDTExceptions(TO_DATE('28-NOV-1996','DD-MON-YYYY'), TO_DATE('25-DEC-1996','DD-MON-YYYY')), ORDSYS.ORDTExceptions() ));
Notes on Example 2-1:
When you want to create calendars that conform to time series data, you can use the DeriveExceptions function to simplify the process. You can use one of two approaches with DeriveExceptions, depending on your needs and the requirements for each approach:
While the first approach can be performed in a single step, the second approach requires an additional step (before DeriveExceptions is called) in order to construct the first time series.
Although the first approach is simpler in practice, the second approach has significant performance advantages when you need to define multiple calendars that have the same frequency and pattern but different exception lists. The first approach is less efficient than the second approach in this case, because the internal implementation of the first approach generates a collection of dates based on the input calendar parameter. If you need to derive exceptions for multiple calendars defined on the same frequency and pattern, this date-generation operation is performed multiple times. You can avoid these multiple date-generation operations by using the second approach.
Section 3.3 contains more detailed information about using each approach to deriving calendar exceptions.
Figure 2-3 shows the Time Series cartridge architecture. At the lowest level, a storage option is required, and for the initial release this must be a flat index-organized table (IOT). For future releases, nested IOT and VARRAY storage options are planned. The actual cartridge consists of PL/SQL packages for calendar, time series, and time scaling functions. In addition, a collection-based interface between time series storage and the packaged functions is provided.
The rest of this chapter describes this architecture, working from bottom to top in Figure 2-3:
The chapter concludes with the steps for using the Time Series cartridge.
The underlying storage model must satisfy the following requirements for time series data:
A time series is stored as multiple rows in a flat index-organized table (IOT).1 Each row stores a ticker, a timestamp, and composite data. This storage option is shown in Figure 2-2.
The flat IOT storage model provides efficient utilization of disk storage (for example, the timestamp data is stored once per composite entry) and allows flexibility in queries; however, it does require that users perform certain manual actions:
The interfaces to the time series and time scaling functions rely on the following aspects of the Time Series cartridge architecture:
Two basic interfaces to time series and time scaling functions are defined:
In the instance-based interface, the first input parameter to a time series function is an instance of a time series (for example, ORDTNumSeries).
In the reference-based interface, the first input parameter to a time series function is a reference to a time series (for example, ORDTNumSeriesIOTRef). The reference-based interface requires that you provide enough descriptive information to enable the functions to execute dynamic SQL to obtain an instance of a time series.
The datatypes related to the instance-based and reference-based interfaces (for example, ORDTNumSeries and ORDTNumSeriesIOTRef) are discussed in Sections 2.5.1 and 2.5.2.
Note that both types of interfaces return only instances of time series (for example, ORDTNumSeries). Also, because nesting of time series functions is allowed (for example, SELECT (Lead(Mavg, ...) ...)), the instance-based interface is used internally for the second and subsequent levels of nesting.
When possible, you should use the reference-based interface. Although this interface may be difficult to understand initially, it offers significant performance advantages over the instance-based interface. The examples in this manual emphasize the reference-based interface.
Time series functions operate on instances of time series objects (for example, an ORDTNumSeries). An instance of a time series object includes a name field, an instance of a calendar, and an instance of a time series. For example, as the following type definitions for a numeric time series show, ORDTNumTab defines a collection and ORDTNumSeries bundles a calendar instance with a collection:
CREATE TYPE ORDSYS.ORDTNumCell AS OBJECT (tstamp DATE, value NUMBER); CREATE TYPE ORDSYS.ORDTNumTab AS TABLE OF ORDTNumCell; CREATE TYPE ORDSYS.ORDTNumSeries AS OBJECT ( name VARCHAR2(256), cal ORDTCalendar, series ORDTNumTab );
For a numeric time series, the time series data is contained in the ORDTNumTab structure. This structure is a table of a DATE column and a NUMBER column, and is also known as a collection.
Figure 2-4 shows an example of an ORDTNumTab collection type
Tstamp | Value |
---|---|
01-01-1996 |
22.00 |
01-02-1996 |
23.00 |
... |
... |
12-31-1996 |
... |
Functions such as Mavg (Moving Average, described in Section 2.8.9) use the ORDTNumTab structure as the source data for performing computations, and they use the ORDTCalendar type to enable navigation through the time series data. The calendar-based navigation is especially useful for functions such as Mavg, which has as input parameters the starting date (startDate) and ending date (endDate) for which to return moving averages and an integer (k) indicating the look-back window (k denoting the number of timestamps, including the current one, over which to compute the moving average). Calendar-based navigation is used to determine the date that is k-1 timestamps previous to startDate.
Although time series functions operate on time series instances, they are invoked from SQL using a REF to a time series. For a numeric time series, this type is an ORDTNumSeriesIOTRef. (Section 2.5.2 explains the use of REFs in the reference-based interface.) The REF contains enough information so that time series functions can derive the instance (ORDTNumSeries) at runtime (using dynamic SQL).
The convention of defining an interface on a DATE column and a single NUMBER column provides a uniform interface for time series functions. Because the underlying IOT that stores time series data may have multiple NUMBER columns, the view defining the REF also maps the underlying storage to conform to the two-column interface defined by the ORDTNumSeries type.
The following are the key aspects of the instance-based interface to time series functions:
The Time Series cartridge provides a reference-based interface for time series and time scaling functions.
This interface provides efficient performance, especially when only a portion of the time series is accessed. The performance benefit of this interface results from the fact that at runtime the reference-based interface materializes only those rows within the specified date range, as opposed to materializing the entire collection of rows from the time series.
The reference-based interface uses the ORDTNumSeriesIOTRef and ORDTVarchar2SeriesIOTRef types, which include a REF to a calendar, plus several literal values. At runtime, reference-based time series functions use these literal values to form and execute a SQL statement (using dynamic SQL) that derives an instance of a time series that contains only the timestamps needed for this instance. The time series function determines which timestamps are needed based on the minDate and maxDate parameters to the function.
The ORDTNumSeriesIOTRef type is defined as follows:
CREATE TYPE ORDSYS.ORDTNumSeriesIOTRef AS OBJECT ( name VARCHAR2(256), cal REF ORDSYS.ORDTCalendar, table_name VARCHAR2(256), tstamp_colname VARCHAR2(30), value_colname VARCHAR2(30), qualifier_colname VARCHAR2(30), qualifier_value VARCHAR2(4000) );
The attributes of the ORDTNumSeriesIOTRef type are as follows:
In the Time Series cartridge usage demo, the view stockdemo_ts uses the reference-based interface to time series functions. The stockdemo_ts view determines which calendar should be coupled with the time series by accessing the calendar (stockdemo_calendars) and metadata (stockdemo_metadata) tables. The pricing data is accessed through the underlying table containing historical time series pricing data (stockdemo). For an illustration of the relationship between the reference-based view and the underlying tables in the Time Series cartridge usage demo, see Figure 1-1 in Chapter 1.
The stockdemo_ts view is defined as follows:
CREATE OR REPLACE VIEW stockdemo_ts(ticker,open,high,low,close,volume) AS SELECT meta.tickername, ORDSYS.ORDTNumSeriesIOTRef( substr(meta.tickername, 1, 230) || ' open NumSeries', Ref(cal), 'ordtdev.stockdemo', 'tstamp', 'open', 'ticker', meta.tickername), ORDSYS.ORDTNumSeriesIOTRef( substr(meta.tickername, 1, 230) || ' high NumSeries', Ref(cal), 'ordtdev.stockdemo', 'tstamp', 'high', 'ticker', meta.tickername), ORDSYS.ORDTNumSeriesIOTRef( substr(meta.tickername, 1, 230) || ' low NumSeries', Ref(cal), 'ordtdev.stockdemo', 'tstamp', 'low', 'ticker', meta.tickername), ORDSYS.ORDTNumSeriesIOTRef( substr(meta.tickername, 1, 230) || ' close NumSeries', Ref(cal), 'ordtdev.stockdemo', 'tstamp', 'close', 'ticker', meta.tickername), ORDSYS.ORDTNumSeriesIOTRef( substr(meta.tickername, 1, 230) || ' volume NumSeries', Ref(cal), 'ordtdev.stockdemo', 'tstamp', 'volume', 'ticker', meta.tickername) FROM stockdemo_metadata meta, stockdemo_calendars cal WHERE meta.calendarname = cal.name;
Depending on which column is selected, a different literal value is applied as an attribute of the ORDTNumSeriesIOTRef type. For example, for the following query:
SELECT ORDSYS.TimeSeries.Mavg(close, to_date('02-DEC-96','DD-MON-YY'), to_date('31-DEC-96','DD-MON-YY'), 10) FROM ORDTDEV.stockdemo_ts WHERE ticker='ACME';
The literal value close is used as the value_colname column name. The other attributes of the ORDTNumSeriesIOTRef type include the timestamp column name (tstamp), a qualifying column name (ticker), and the actual value of the qualifying column (meta.tickername).
The implementation of time series functions uses the information stored in the ORDTNumSeriesIOTRef type to generate the appropriate dynamic SQL statement at runtime. Using the preceding example, to instantiate a time series object (that is, to convert an ORDTNumSeriesIOTRef to an ORDTNumSeries), the Mavg function generates a query that performs the following action (with the logic shown, not the exact syntax):
SELECT tstamp, close FROM ordtdev.stockdemo_ts WHERE ticker='ACME' and tstamp BETWEEN <a date range adjusted to reflect the 10-day window and the calendar, including any holidays>;
The Mavg function computes the moving average and returns the result as a time series instance (ORDTNumSeries). For more information about the Mavg function, see Section 2.8.9.
Most time series and time scaling functions rely on calendars that are consistent with time series data.2 By assuming a time series is consistent with its calendar, time series and time scaling functions can use the calendar as a basis for navigation of time series data.
Time series consistency must be maintained; otherwise, functions might raise exceptions or return incorrect results.
For a time series to be consistent, the following must be true:
If some mechanism is not used to enforce these consistency rules, accidental or malicious actions could destroy the integrity of the time series data. For example, a user might delete rows from the middle of the time series, rather than being restricted to deleting rows at the beginning and the end of the date range for the time series.
Enforcing time series consistency can be accomplished with a security view. A security view is a relational view of time series data that uses INSTEAD OF triggers to maintain time series consistency. (For an explanation of INSTEAD OF triggers, see the Oracle8 Server Concepts manual.) The security view is intended to be used for limited or moderate insert, update, and delete operations; it is not intended for bulk changes to time series data.
The cartridge demo (see Section 1.6) includes a security view defined in the file demo/usage/securevw.sql. This security view:
Timestamps are purified to match the precision of the calendar. For example, for a calendar with a day frequency, any hour, minute, and second values in the input timestamp are set to zero. Only purified timestamps are inserted into a time series, and timestamps are purified if necessary before insert and delete operations.
INSTEAD OF triggers enforce rules on insert, delete, and update operations. These rules maintain time series data that conforms to the associated calendar.
For insert operations, the following rules apply:
For delete operations, the following rules apply:
For update operations, the following rules apply:
INSTEAD OF triggers in a security view enable you to ensure that a time series meets the consistency requirements described in Section 2.6.1.
INSTEAD OF triggers allow for multiple timestamps to be inserted or deleted in a single query, given that the group of timestamps inserted or deleted are in the proper order. For example, a specified number of timestamps can be deleted from the beginning of a time series by using a simple range restriction on the timestamp. A specified number of timestamps can be inserted at the end of a time series by using a subquery that references another table containing time series data.
The SQL*Loader utility is useful for loading large amounts of data into a table. For better performance, you should perform bulk loads on underlying tables instead of on security views. However, after you load data into the tables, you must ensure time series consistency by using one of the following approaches:
If you are sure that all timestamps are correct, it is safe to adjust the calendar to be consistent with the time series. This strategy is normally appropriate when there is a unique calendar per time series.
The DeriveExceptions function is useful for adjusting a calendar to be consistent with the time series.
If you expect time series data to adhere to a predefined calendar, validating each time series is the better approach. This approach is particularly useful if the same calendar is used for all time series data being loaded.
The IsValidTimeSeries function can be used to check if the time series is consistent with the calendar.
For better performance in the case of a shared calendar for all time series, you may want to customize time series validation using PL/SQL. This involves writing custom utility functions that call Time Series cartridge product-developer calendar functions (see Section 2.7.2) to test and maintain time series consistency.
Section 3.2 contains additional information and examples of bulk and incremental loading of time series data.
The Time Series cartridge provides calendar functions for querying and modifying calendars. The calendar functions can be divided into the following categories:
Reference information for all calendar functions is in Chapter 4.
End-user functions let you use the main calendar-related features of the Time Series cartridge. If you do not need to modify or expand the Time Series cartridge capabilities, you probably can limit your use of calendar functions to those listed in Table 2-3.
Product-developer functions let you modify and expand the Time Series cartridge capabilities. For example, you could use product-developer calendar functions in creating a new function that modified the information returned for the moving average or that returned a net present value for a portfolio of stocks at a specified date.
Table 2-4 lists the product-developer calendar functions.
For an example of using product-developer functions, see Section 3.4.
Time series functions operate on a time series. A time series type is always used as the input parameter to a time series function.
Reference information for all time series functions is in Chapter 5.
Time series functions are defined over datatypes that contain a calendar and a collection. The Time Series cartridge provides the following time series datatypes. (Calendar datatypes are described in Section 2.2.2.)
CREATE TYPE ORDSYS.ORDTNumCell AS OBJECT (tstamp DATE, value NUMBER); CREATE TYPE ORDSYS.ORDTNumTab AS TABLE OF ORDSYS.ORDTNumCell; CREATE TYPE ORDSYS.ORDTNumSeries AS OBJECT ( name VARCHAR2(256), cal ORDSYS.ORDTCalendar, series ORDSYS.ORDTNumTab ); CREATE TYPE ORDSYS.ORDTNumSeriesIOTRef AS OBJECT ( name VARCHAR2(256), cal REF ORDSYS.ORDTCalendar, table_name VARCHAR2(256), tstamp_colname VARCHAR2(30), value_colname VARCHAR2(30), qualifier_colname VARCHAR2(30), qualifier_value VARCHAR2(4000) ); CREATE TYPE ORDSYS.ORDTVarchar2Cell AS OBJECT (tstamp DATE, value VARCHAR2(4000)); CREATE TYPE ORDSYS.ORDTVarchar2Tab AS TABLE OF ORDSYS.ORDTVarchar2Cell; CREATE TYPE ORDSYS.ORDTVarchar2Series AS OBJECT ( name VARCHAR2(256), cal ORDSYS.ORDTCalendar, series ORDSYS.ORDTVarchar2Tab ); CREATE TYPE ORDSYS.ORDTVarchar2SeriesIOTRef AS OBJECT ( name VARCHAR2(256), cal REF ORDSYS.ORDTCalendar, table_name VARCHAR2(256), tstamp_colname VARCHAR2(30), value_colname VARCHAR2(30), qualifier_colname VARCHAR2(30), qualifier_value VARCHAR2(4000) ); CREATE TYPE ORDSYS.ORDTDateTab AS TABLE OF DATE;
The preceding statements show the definition of a numeric time series and a character time series (instance-based and reference-based interfaces), each composed of a calendar instance and a collection. The collection (ORDTxxxTab) is defined as a table of ORDTxxxCell (except for ORDTDateTab, which is a table of DATE). Time Series cartridge datatypes, such as ORDTNumSeries and ORDTVarchar2Series, are input and output parameters of time series functions.
For time series functions that accept two time series, both time series must be defined on calendars that have the same frequency and the same pattern. The calendars may have different exceptions lists and different starting and ending dates.
A number of time series functions perform arithmetic, comparison, and grouping operations. When nulls are encountered in this context, the default behavior is to mirror SQL:
For example, the sum of (1, NULL, NULL, 3) returns 4. The sum of (NULL, NULL, NULL, NULL) returns null.
For example, if there are 5 nulls in the last 30 timestamps for (and including) a specific date, the 30-day moving average on that date is computed using only 25 values (that is, adding only the non-null values and dividing by 25). However, if all 30 dates (the date and the 29 previous dates) have nulls, the moving average for that date is null.
For example, 10 + NULL returns null.
For example, a GT comparison of 30-Jun-1997 and null returns null.
Note that because PL/SQL does not implement UNKNOWN, these semantics are slightly different than the SQL treatment of comparisons with nulls. In SQL, a comparison operator that encounters a null returns UNKNOWN, which is like a null, except that operations on UNKNOWN return UNKNOWN.
For example, if you are scaling up daily data from 01-Jan-1997 through 30-Jun-1997 to monthly data, and if there are no values for the month of February, a null is returned for February and scaled data is returned for the other months. (Note that this behavior differs from the standard GROUP BY scaling in SQL, in which February would be missing in the scaled results.)
Some functions allow alternate semantics in the form of an option. The reference information for each function describes any alternate semantics options.
In comparisons of two time series, it is possible that a timestamp valid for one time series is not valid for the other time series. Operations on two time series having similar calendars return a time series that is defined over a new calendar. This new calendar is derived from the two input calendars, using all of the following:
For example, assume the following two calendars:
The new (derived) calendar is: 01-Feb-1997 through 01-Dec-1997; daily pattern '0,1,1,1,1,1,0' (Monday through Friday), off-exceptions 01-May and 14-Jul; on-exception 29-Jun.
Time series extraction, retrieval, and trim functions operate on any time series type. Extraction functions return one or more time series rows, while retrieval and trim functions return a time series.
Table 2-5 lists the extraction functions.
Table 2-6 lists the retrieval and trim functions.
Shift functions lead or lag a time series by a specified number of units, where units reflects the frequency of the calendar for the time series.
Function | Description |
---|---|
Lead | Leads a time series by the specified number of units. |
Lag | Lags a time series by the specified number of units. |
When called from a SQL SELECT expression, a time series function returns an instance of a time series datatype, which is not displayable. The SQL formatting functions facilitate format conversions that allow time series to be displayed.
Aggregate functions return scalar or ORDTNumTab values. Each aggregate function can be used in either of the following ways:
Thus, each aggregate function is of the form:
f(ts ORTDNumSeries, [date1 DATE, date2 DATE])
Arithmetic functions accept two time series (ORDTNumSeries1,ORDTNumSeries2) or a time series and a constant (ORDTNumSeries1, Const), and perform a pairwise arithmetic operation on each element of the time series. This operation determines the value of each element of the returned time series:
Algorithm for f(ts1, ts2) ForAll i, tsRet(i) = ts1(i) op ts2(i);
Function | Description |
---|---|
TSAdd | Time series addition |
TSDivide | Time series division |
TSMultiply | Time series multiplication |
TSSubtract | Time series subtraction |
Cumulative sequence functions operate on successive elements of a time series, accumulating the result into the current element of the output time series. For example, CSUM((1,2,3,4,5)) => (1,3,6,10,15). In this example, the result time series (f(i)), is computed from the input time series (I(i)) as follows:
f(1) = I(1) ForAll i > 1, f(i) = f(i - 1) + I(i)
Function | Returns |
---|---|
Cavg | Cumulative average |
Cmax | Cumulative maximum |
Cmin | Cumulative minimum |
Cprod | Cumulative product |
Csum | Cumulative sum |
The Moving Average (Mavg) function returns a time series that contains the averages of values from each successive timestamp for a specified interval over a range of dates. For example, the 30-day moving average for a stock is the average of the closing price for the specified date and the 29 trading days preceding it.
The Moving Sum (Msum) function returns a sum of values from each successive timestamp for a specified interval over a range of dates. For example, the 30-day moving sum of trading volumes for a stock is the sum of the volume for the specified date and for 29 trading days preceding it.
Function | Returns |
---|---|
Mavg | Moving average |
Msum | Moving sum |
The relationship between the input and output time series in the computation of a moving average or sum is illustrated in Figure 2-5. The figure focuses on the common invocation of moving average or sum, where k is the number of timestamps in the look-back window (for example, 30) and a date range (startDate and endDate) is supplied. (For more information about the parameters, see the Mavg function description in Chapter 5.)
Conversion functions fill missing elements of a numeric time series (ORDTNumSeries). Missing elements are those where their timestamps are defined by the calendar and are in the range of the current time series, but they are not currently in the time series.
Function | Description |
---|---|
Fill | Fills a time series based on the calendar and fill type. |
The Time Series cartridge provides functions to scale up time series data. Scaleup functions produce summary information from finer granularity information, for example, monthly data based on daily data. Scaleup is also known as rollup.
The relationship between the input and output time series in a scaleup operation is illustrated in Figure 2-6, which shows a mapping when scaling from a daily frequency to a monthly frequency.
Figure 2-6 shows all days in February being mapped to the month of February. This mapping also suggests the importance of the precision of timestamps of different frequencies. In the example shown in this figure:
Two interfaces to time scaling are supported: the collection-based interface (operations on collections) and the GROUP BY interface (SQL GROUP BY clause). Section 2.9.1 discusses the collection-based interface for time scaling, and Section 2.9.2 discusses the GROUP BY interface.
Note: You should use the collection-based interface for most time scaling queries. Although the GROUP BY interface is useful for certain advanced queries (see Section 2.9.2), the collection-based interface offers much better performance in most cases. |
The scaleup functions accept as input a numeric time series and a destination calendar. A numeric time series is returned, which is scaled based on the destination calendar.
For example, the following statement returns the last closing prices for stock SAMCO for the months of October, November, and December of 1996:
select * from the (select cast(ORDSYS.TimeSeries.ExtractTable( ORDSYS.TimeSeries.ScaleupLast( ts.close, sc.calendar, to_date('01-OCT-1996','DD-MON-YYYY'), to_date('01-JAN-1997','DD-MON-YYYY') ) ) as ORDSYS.ORDTNumTab) from ordtdev.stocks_ts ts, ordtdev.scale sc where ts.ticker='SAMCO' and sc.name ='MONTHLY');
This example might produce the following output:
TSTAMP VALUE --------- ---------- 01-OCT-96 42.375 01-NOV-96 38.25 01-DEC-96 39.75 3 rows selected.
Note that each timestamp reflects the first date of the month in the calendar (following the rules explained in Section 2.2.1), and each value in this case reflects the closing price on the last date for that month in the calendar.
Scaleup functions ignore nulls. For example, ScaleupAvg returns a time series reflecting the average value of each scaled group of non-null values.
Time scaling in the GROUP BY clause supports statements such as the following:
SELECT sum(volume), max(high), min(low) FROM StockTab, CalendarTab cal WHERE ticker = 'XYZ' AND cal.name = 'Monthly' GROUP BY ORDSYS.TimeSeries.Scaleup(tstamp, cal)
This statement scans the daily data stored in StockTab, and sums the volume attribute on a monthly basis. The calendar to be scaled up to is a parameter of the Scaleup function, and is extracted from a table of calendars, CalendarTab, which is of the form:
CREATE TABLE CalendarTab of ORDTCalendar;
The Scaleup function accepts a timestamp and a calendar, and returns a timestamp. If the input timestamp is a valid timestamp of the calendar, the input timestamp is returned; otherwise, the closest timestamp in that calendar that precedes the input timestamp is returned.
Only SQL aggregate functions are supported in the GROUP BY interface.
1
A time series could be stored in a standard table; however, for performance reasons it is highly recommended that you use an IOT rather than a standard table.
2
An exception is the Fill function, which can be used to add pairs of timestamps and values to make a time series consistent with the calendar.