Data Sources Determination

Selecting the data sources to include in WellLine is an important and non-trivial decision. Many sources make logical sense to include in the TimeLine Service, and thus in the WellLine Knowledge Application, while others may not add enough value or may be unfit for inclusion.

The process of selecting data varies greatly from one installation to the next. Below are some general questions the WellLine team considers during this process:

Is the data legally accessible by a 3rd party application over a network?

  • This is a straightforward criterion that must be met before a data source can be included in the product

  • Data must be accessible legally, meaning all sign-offs have been obtained for its transmission to a 3rd party application (i.e. WellLine), including the storage of portions of the source

  • WellLine by default is installed on Azure and requires network connectivity to function; even if the product is installed with only intranet accessibility, data must still be uploaded to it over the network

  • Consider all sources of data that are or will be uploaded to the product, e.g. would a combination of sources pose a problem?

Does the data have a clear association to a time frame?

  • One of the few required fields in the WellLine Data Model is startedOn, a datetime field; it is required because every event lives on the TimeLine and needs a date to be placed there

  • The idea of "a clear association" means that each potential event within the source system should correlate directly to a single date/datetime or to a range of dates/datetimes
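
As an illustration of the time-frame question, the following Python sketch checks whether a source record can supply a startedOn value. Only startedOn comes from the WellLine Data Model as described here; the source keys "start" and "end" and the output field endedOn are hypothetical names chosen for this example.

```python
from datetime import datetime
from typing import Optional


def to_time_frame(record: dict) -> Optional[dict]:
    """Derive a time frame from a source record, or return None.

    Assumes (hypothetically) that the source stores ISO 8601 strings
    under the keys "start" and, optionally, "end".
    """
    start = record.get("start")
    if not start:
        # No date at all: the record has no clear association to a time frame.
        return None

    frame = {"startedOn": datetime.fromisoformat(start).isoformat()}

    end = record.get("end")
    if end:
        # "endedOn" is a hypothetical field name used to show a date range.
        frame["endedOn"] = datetime.fromisoformat(end).isoformat()
    return frame
```

A record for which this returns None would need a date derived from elsewhere in the source before it could be placed on the TimeLine.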

Can the data be directly associated with at least one asset/entity of interest?

  • Events with no associated entities will be hard to discover within the WellLine Knowledge Application, since a primary search method is the autocomplete dropdown, which is populated with entities

  • Examples of "asset/entity of interest" could include: one or more wells; one or more rigs; one or more fields; one or more people

  • When evaluating this question, consider how a user might want to discover the data within WellLine

    • For example, if users are accustomed to seeing data from a well-centric view, it might be best to include data that can be directly associated with a well

  • Remember that all events containing unstructured text are run through the WellLine natural language processing (NLP) pipeline for entity extraction, but not every event contains free-text entities to extract; including at least one "structured"/known entity during upload ensures the event can be easily found by users
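
The point about structured entities can be checked before upload. The sketch below is a minimal Python example that separates candidate events with at least one known entity from those that would rely on NLP extraction alone. The "entities" list and its "name" key are hypothetical and do not reflect the actual WellLine data model.

```python
from typing import Iterable, List, Tuple


def split_by_structured_entity(events: Iterable[dict]) -> Tuple[List[dict], List[dict]]:
    """Partition events into (has a structured entity, relies on free text only).

    Assumes (hypothetically) that each event carries an optional "entities"
    list whose items are dicts with a "name" key.
    """
    with_entity: List[dict] = []
    text_only: List[dict] = []
    for event in events:
        entities = event.get("entities") or []
        if any(entity.get("name") for entity in entities):
            with_entity.append(event)
        else:
            text_only.append(event)
    return with_entity, text_only
```

Events that land in the second list are candidates for enrichment, for example by attaching the well or rig they were recorded against.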

Can individual events of importance be constructed from the data?

  • The WellLine Knowledge Application and data model are centered around the concept of "events", simply defined as: "things that happen, of some importance"

    • A broad definition was chosen so as to permit a large variety of information to live in the product

    • Some examples of events:

      • Wellsite comment was recorded

      • Bottom hole pressure test occurred

      • Casing was set

      • Well-related document was written

      • Rig sensor exceeded a threshold

      • HSE incident occurred

      • Investigation into NPT was completed

      • Production metrics were gathered

      • Workover / Intervention efforts began

One rule of thumb for answering this question is to ask another:

"Taking into account other data sources being uploaded to WellLine, will the majority of this data potentially provide value or insight to a user?"

  • If the answer is "yes", then perhaps the entire data source can be included

  • If the answer is "no", then perhaps only a subset could be included

  • The idea of "importance" is particularly good to keep in mind when it comes to real-time or streaming data

    • For example, is every data point captured by every sensor providing value or insight, or are only specific points of data providing value/insight?

      • If the answer is "only specific points", then some pre-processing should be performed, ideally prior to data transformation

  • Keep in mind that unless the source data is already formatted as per the WellLine data model, events will always need to be "constructed"; how straightforward this process is depends on the complexity of the source
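
For real-time or streaming data, the pre-processing mentioned above might reduce a sensor stream to the points where a threshold is first exceeded. The following Python sketch shows one such approach under stated assumptions: readings arrive as (timestamp, value) pairs in time order, and the output keys other than startedOn are hypothetical. It is one possible filter, not the WellLine team's method.

```python
from typing import Iterable, Iterator, Tuple

Reading = Tuple[str, float]  # (ISO 8601 timestamp, sensor value)


def threshold_crossings(readings: Iterable[Reading], threshold: float) -> Iterator[dict]:
    """Yield one candidate event each time the value rises above the threshold.

    Consecutive readings that stay above the threshold produce no further
    events, so a long exceedance becomes a single event instead of many.
    """
    above = False
    for timestamp, value in readings:
        is_above = value > threshold
        if is_above and not above:
            # "value" and "threshold" are hypothetical field names.
            yield {"startedOn": timestamp, "value": value, "threshold": threshold}
        above = is_above
```

Filtering at this stage, before transformation, keeps the number of constructed events proportional to the number of notable occurrences instead of the sampling rate of the sensor.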

These guiding questions are by no means an exhaustive evaluation of a data source. With that in mind, if the answer to all of them is "yes", the data source is generally a good candidate for transformation to the WellLine data model and subsequent upload to the TimeLine Service.