User Manual 3.4.1 Data management system
Introduction
Scope
This chapter presents data management in the Orekit library. It answers the following questions about the data management system provided by Orekit:
- how does the data management system of Orekit work?
- how to set it up?
- how to use it?
- how to add data to what already exists?
- what are the pros and cons of this system?
NOTE: this page describes Orekit's data management system assuming it will be used "as is" in PATRIUS, but this is subject to change in later development stages.
Javadoc
The data objects are available in the package org.orekit.data of the Orekit library.
(% style="width: 474px; height: 45px;" %)
|=Library|=Javadoc
|Orekit|[[[:Modèle:JavaDoc3.4]]/org/orekit/data/package-summary.html Package org.orekit.data]
Links
Modèle:SpecialInclusion prefix=$theme sub section="Links"/
Useful Documents
Modèle:SpecialInclusion prefix=$theme sub section="UsefulDocs"/
Package Overview
The data loading process is organized through three main objects.
The DataProvider classes handle data sources. Each one can browse a particular type of source. The DirectoryCrawler performs a bottom-first search in a directory tree. The ZipJarCrawler works the same way, but inside a compressed file. The ClassPathCrawler handles a list of data files and/or compressed files that are in the classpath (unlike the DirectoryCrawler, it cannot search recursively). Finally, the NetworkCrawler works like the ClassPathCrawler, except that it holds a list of URLs instead of files. There is no limit to the number of DataProviders a program can use at once.
The providers are listed and put to work through the DataProvidersManager singleton. This is the single point of access to the data management system. It contains a list of providers that are queried every time a user needs data.
The various crawlers provide streams to the DataLoader. From these streams, the DataLoaders can reconstruct data that was stored in files (compressed or not), even if the files come from different sources. These streams effectively separate the machine world from the program world, because they hide the former from the latter. Therefore, parsing data from a new format only requires creating a loader, and reading another kind of source only requires creating a DataProvider. Note that the DataLoaders usually serve as a facade for the higher layers of the program.
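The separation between sources and loaders can be illustrated with a minimal, self-contained sketch. It does not use the Orekit classes: SimpleLoader, KeyValueLoader, openPlain and openGzip are hypothetical stand-ins, and the "key=value" format is made up for the example. The point it shows is that the loader only ever sees a stream and a name, so the same parsing code works whether the bytes come from a plain or a compressed source.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class StreamSeparationSketch {

    /** Hypothetical stand-in for a loader: it only sees a stream and a file name. */
    interface SimpleLoader {
        void loadData(InputStream input, String name) throws IOException;
    }

    /** Loader for a made-up "key=value" text format. */
    static class KeyValueLoader implements SimpleLoader {
        final Map<String, String> values = new LinkedHashMap<>();

        @Override
        public void loadData(InputStream input, String name) throws IOException {
            // The stream is owned by the provider side, so it is not closed here.
            BufferedReader reader =
                new BufferedReader(new InputStreamReader(input, StandardCharsets.UTF_8));
            String line;
            while ((line = reader.readLine()) != null) {
                int eq = line.indexOf('=');
                if (eq > 0) {
                    values.put(line.substring(0, eq).trim(), line.substring(eq + 1).trim());
                }
            }
        }
    }

    /** Provider side: opens a plain in-memory "file". */
    static InputStream openPlain(byte[] content) {
        return new ByteArrayInputStream(content);
    }

    /** Provider side: opens a gzip-compressed "file" and hides the compression. */
    static InputStream openGzip(byte[] compressed) throws IOException {
        return new GZIPInputStream(new ByteArrayInputStream(compressed));
    }

    /** Helper used only to build compressed test content in memory. */
    static byte[] gzip(byte[] raw) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
            gz.write(raw);
        }
        return bos.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        KeyValueLoader loader = new KeyValueLoader();
        loader.loadData(openPlain("a=1\n".getBytes(StandardCharsets.UTF_8)), "first.txt");
        loader.loadData(openGzip(gzip("b=2\n".getBytes(StandardCharsets.UTF_8))), "second.txt.gz");
        System.out.println(loader.values);
    }
}
```

In this sketch, supporting a new file format would mean writing another SimpleLoader, while supporting a new kind of source would mean writing another method that returns an InputStream.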
Features Description
Default provider
The data management system can use a system-wide property, orekit.data.path, as an entry point for default data. This default data must be file-based (either a file system entry point or a Java resource) and either a directory or a zip/jar file.
Setting a default provider is not mandatory; it must be done explicitly by:
- setting a value for orekit.data.path,
- calling addDefaultProviders on the data providers manager.
The Orekit library jar contains data that can be used as default data.
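As a minimal sketch of the property-setting step, the snippet below sets orekit.data.path from Java code using only the standard library. The class name, the helper setDefaultDataPath and the directory name "orekit-data" are hypothetical. The call on the manager is shown only as a comment because it needs the Orekit jar on the classpath; it assumes the singleton is reached through getInstance().

```java
import java.io.File;

public class DefaultPathSketch {

    /** Name of the system-wide property described in this section. */
    static final String OREKIT_DATA_PATH = "orekit.data.path";

    /** Sets the property to one entry point; returns the previous value (null if none). */
    static String setDefaultDataPath(File entryPoint) {
        return System.setProperty(OREKIT_DATA_PATH, entryPoint.getPath());
    }

    public static void main(String[] args) {
        // Hypothetical location: replace with a real directory or zip/jar file.
        setDefaultDataPath(new File("orekit-data"));
        System.out.println(System.getProperty(OREKIT_DATA_PATH));

        // With Orekit on the classpath, the manager would then be asked to
        // register the default providers (assumed accessor name):
        // DataProvidersManager.getInstance().addDefaultProviders();
    }
}
```

The same property can also be given on the command line with the standard JVM option -Dorekit.data.path=..., which avoids hard-coding a location in the program.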
Adding a provider
Modèle:SpecialInclusion prefix=$theme sub section="Provider"/
Using the data management system
The data management system mainly operates through the feed method. This method takes a DataLoader and a regexp string matching the names of the files the DataLoader is able to process. During this method call:
- the DataProviders list is traversed in priority order,
- the first DataProvider providing a file matching the regexp is the one (and only one) used to feed the DataLoader.
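The traversal rule described above can be sketched with simplified stand-ins. The classes RecordingLoader, MapProvider and Manager below are hypothetical and are not the Orekit API; the sketch also assumes that a provider hands every matching file it holds to the loader, which the text above does not state.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

public class FeedOrderSketch {

    /** Hypothetical stand-in for a loader: records what it was fed. */
    static class RecordingLoader {
        final List<String> loaded = new ArrayList<>();

        void loadData(String name, String content) {
            loaded.add(name + ":" + content);
        }
    }

    /** Hypothetical stand-in for a provider: a set of in-memory files. */
    static class MapProvider {
        final Map<String, String> files = new LinkedHashMap<>();

        MapProvider with(String name, String content) {
            files.put(name, content);
            return this;
        }

        /** Feeds every matching file to the loader; true if at least one matched. */
        boolean feed(Pattern supported, RecordingLoader loader) {
            boolean fed = false;
            for (Map.Entry<String, String> e : files.entrySet()) {
                if (supported.matcher(e.getKey()).matches()) {
                    loader.loadData(e.getKey(), e.getValue());
                    fed = true;
                }
            }
            return fed;
        }
    }

    /** Hypothetical stand-in for the manager: providers are tried in list order. */
    static class Manager {
        final List<MapProvider> providers = new ArrayList<>();

        boolean feed(String supportedNames, RecordingLoader loader) {
            Pattern supported = Pattern.compile(supportedNames);
            for (MapProvider provider : providers) {
                if (provider.feed(supported, loader)) {
                    return true; // first provider with a match wins; later ones are ignored
                }
            }
            return false; // no provider had a matching file
        }
    }

    public static void main(String[] args) {
        Manager manager = new Manager();
        manager.providers.add(new MapProvider().with("notes.txt", "n"));
        manager.providers.add(new MapProvider().with("eop_2020.dat", "A"));
        manager.providers.add(new MapProvider().with("eop_2021.dat", "B"));

        RecordingLoader loader = new RecordingLoader();
        manager.feed("^eop_.*\\.dat$", loader);
        System.out.println(loader.loaded); // only the second provider was used
    }
}
```

The third provider is never consulted in this run, which is why the order of the providers list matters when several sources hold the same kind of data.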
Adding new data
Modèle:SpecialInclusion prefix=$theme sub section="Adding"/
Getting Started
Modèle:SpecialInclusion prefix=$theme sub section="GettingStarted"/
Contents
Interfaces
The data package includes the following interfaces:
Data
|=(% colspan="3" %)Interface|=(% colspan="6" %)Summary|=(% colspan="1" %)Javadoc
|(% colspan="3" %)DataLoader|(% colspan="6" %)Interface for loading data files from data providers.|(% colspan="1" %)[[[:Modèle:JavaDoc3.4]]/org/orekit/data/DataLoader.html ...]
|(% colspan="3" %)DataProvider|(% colspan="6" %)Interface for providing data files to file loaders.|(% colspan="1" %)[[[:Modèle:JavaDoc3.4]]/org/orekit/data/DataProvider.html ...]
Classes
The data package includes the following classes:
Data
|=(% colspan="3" %)Class|=(% colspan="6" %)Summary|=(% colspan="1" %)Javadoc
|(% colspan="3" %)DataProvidersManager|(% colspan="6" %)This class is the single point of access for all data loading features.|(% colspan="1" %)[[[:Modèle:JavaDoc3.4]]/org/orekit/data/DataProvidersManager.html ...]
|(% colspan="3" %)DirectoryCrawler|(% colspan="6" %)This class handles data files recursively, starting from a root directory.|(% colspan="1" %)[[[:Modèle:JavaDoc3.4]]/org/orekit/data/DirectoryCrawler.html ...]
|(% colspan="3" %)NetworkCrawler|(% colspan="6" %)This class handles a list of URLs pointing to data files or zip/jar archives on the network.|(% colspan="1" %)[[[:Modèle:JavaDoc3.4]]/org/orekit/data/NetworkCrawler.html ...]
|(% colspan="3" %)ZipJarCrawler|(% colspan="6" %)This class browses all entries in a zip/jar archive in the filesystem or in the classpath.|(% colspan="1" %)[[[:Modèle:JavaDoc3.4]]/org/orekit/data/ZipJarCrawler.html ...]
Tips & Tricks
Strengths
- Lightweight implementation. The providers never load data, they merely provide streams on demand to the loaders.
- Scalable for using data from several heterogeneous sources.
- Scalable for new data types: the user only needs to create a new DataLoader implementation to use a new data type in this system.
Weaknesses
- The data loading overhead is incurred every time a DataLoader is fed, so users must manage their loaders so that each one is fed only once.
- Several sources for the same type of data cannot be used together, since only one provider (the first one in priority order that supplies a matching file) is used to feed a loader, unless the user manages the providers list accordingly, knowing that one can only add elements or reset the whole list.
- The regexp is the only way to match a data file and a DataLoader.
- As of today, the data management system is a thread-hostile singleton: a multithreaded application shares the same providers list across all threads, and it may deadlock on concurrent access.
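One possible way to limit the first weakness on the application side is to cache the result of a feed so that the expensive load runs at most once. The sketch below is a generic pattern and not part of Orekit: LoadOnce is a hypothetical helper, and the supplier stands in for "feed a loader through the manager, then return the parsed data". It only serializes access to its own cached value; it does not make the manager itself thread-safe.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

public class LoadOnceSketch {

    /** Caches the result of an expensive load so that it runs at most once. */
    static class LoadOnce<T> {
        private final Supplier<T> expensiveLoad;
        private T value;
        private boolean loaded;

        LoadOnce(Supplier<T> expensiveLoad) {
            this.expensiveLoad = expensiveLoad;
        }

        /** Synchronized so that concurrent callers never trigger two loads. */
        synchronized T get() {
            if (!loaded) {
                value = expensiveLoad.get();
                loaded = true;
            }
            return value;
        }
    }

    public static void main(String[] args) {
        AtomicInteger feedCount = new AtomicInteger();
        // Stand-in for the real feed-and-parse operation.
        LoadOnce<String> data = new LoadOnce<>(() -> {
            feedCount.incrementAndGet();
            return "parsed data";
        });
        data.get();
        data.get();
        System.out.println("feeds performed: " + feedCount.get());
    }
}
```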