You no longer need to be a genius to do pattern matching

Tapping into life – An Introduction to Stream Analytics

Dear Readers,

Welcome to a new stream (no pun intended) of Red Mavericks articles. This time, we’ll introduce Oracle’s new Stream Analytics.

We’ll guide you through this new, and very cool, product, showing what it is and how it can help you leverage a largely untapped resource: event stream analysis. In fact, streams are everywhere and are becoming more and more open and accessible. If you “wiretap” them, listen to them and understand their behavioral patterns, you can build extremely valuable applications that will help you deliver more to your customers.

It’s a whole new ball game. I hope you find this interesting.

What is Oracle Stream Analytics?

Oracle Stream Analytics (previously Oracle Stream Explorer) is, in fact, an application builder platform. It focuses on applications that process events coming from a wide variety of systems, internal or external to the organization, deriving relevant data and Business Insight from these events.

Stream Analytics – Welcome to Fast Data Business Insight

It works using an Event Processing Engine to perform Fast Data Analysis over a large number of events that typically appear in a given timeframe.

It also provides a run-time platform that will allow you to run and manage the applications you built.

It’s not a new Oracle Event Processor. It uses OEP as the underlying Event Processing Engine (you can also use Apache Spark as a processing engine, if you prefer; more on this in other articles).

The real power of Oracle Stream Analytics is, curiously, in its UI. As an application builder, it goes to great lengths to keep the UI really easy to use. The result is, in my view, very well achieved: it is simple enough that Business Users with a bit of technical knowledge can actually build applications on their own or with little help from IT.

Concepts and Ideas

But to be able to build these applications, you must first understand the concepts and rules behind them. We’ll explain these by mixing real-life concepts and their representations on the platform (Oracle Stream Analytics). Let’s start with the main concepts…

Event

An Event is the representation of something that happened at a particular time. This is most important: an event must always be associated with a notion of time, of when it happened.

Shape

A Shape is the data structure representation of an event. It describes the actual information structure of an event, to ensure at least a minimum of data coherence between events that represent the same occurrence type. If you have a bit of technical knowledge, try to think of the shapes as the XSD of the event.

Events that represent the same type of occurrence should use the same Shape. Events that represent different types of occurrences should use different shapes.
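To make the idea of a Shape a bit more concrete, here is a minimal sketch in Python. It is not how Stream Analytics stores Shapes; the VendingMachineStatus class, its fields and the conforms helper are all hypothetical names made up for illustration.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class VendingMachineStatus:
    """Hypothetical Shape: every status event carries exactly these fields."""
    machine_id: str
    status: str
    occurred_at: datetime  # events always carry a notion of time


def conforms(raw: dict) -> bool:
    """Check that a raw event has exactly the fields the Shape declares."""
    return set(raw) == set(VendingMachineStatus.__dataclass_fields__)
```

An event that carries only a machine ID would fail this check, which is the kind of minimum data coherence a Shape is there to guarantee.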

Stream

A Stream is a sequence of data elements (in this case Events) made available over time. These data elements have shapes that must be known beforehand to allow proper processing. The easiest way to visualize a Stream is to think of a food processing plant conveyor belt transporting vegetables from one point to another inside the plant.

A Bell Pepper “Stream” – Photo by the US Department of Agriculture

As the vegetables go through the conveyor belt they will be made available at a given time at the output of the belt. This will be the point where the person or the system will collect the bell peppers and process them.

Source

A Source represents the system that is making a given stream available. Typically it represents a system that is producing its own data streams or “proxying” data streams from other systems. Stream Analytics will connect to Sources by making Connections to them.

Target

A Target is a channel to which Stream Analytics sends the result of the event processing work. A Target connects downstream to other systems and conforms to a given Shape.

Exploration

An Exploration is Stream Analytics’ way to process events. It allows events to be filtered, combined and enriched with additional data, and event data to be manipulated and converted where appropriate. An Exploration thus produces its own events, which are the result of all this processing.

Explorations can use the output of other Explorations as their inputs, as well as Streams and Reference Data Tables (called simply References in Stream Analytics), which are used to enrich the Exploration outputs.

For instance, a Stream can contain the status of a given vending machine, identified by an internal vending machine ID, while its GPS coordinates are stored in a reference database table. This way, the vending machine doesn’t have to send the GPS coordinates every 5 seconds along with the status, as this information will not change frequently or by itself.
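The vending machine example can be sketched in a few lines of Python. This only illustrates the idea of enriching a stream event with reference data; the machine IDs, coordinates and the enrich function are invented for the example and are not part of Stream Analytics.

```python
# Reference data: changes rarely, so it is stored once rather than streamed.
MACHINE_LOCATIONS = {
    "VM-001": (38.7223, -9.1393),
    "VM-002": (41.1579, -8.6291),
}


def enrich(event: dict, locations: dict) -> dict:
    """Return a copy of the event with GPS coordinates looked up by machine ID."""
    lat, lon = locations[event["machine_id"]]
    return {**event, "latitude": lat, "longitude": lon}


# The stream event only carries the ID and the status.
status_event = {"machine_id": "VM-001", "status": "LOW_STOCK"}
enriched = enrich(status_event, MACHINE_LOCATIONS)
```

The status event stays small, and the coordinates are joined in only when the Exploration needs them.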

Pattern

A Pattern is, well… a pattern 🙂 a repetitive regularity that can be identified by some means.

Stream Analytics allows you to create new Explorations based on given patterns such as trends over time, geospatial boundary checks, Top/Bottom N matches, etc. and, if there are matches, pass these on to Targets.

Stream Analytics – Patterns Palette

Timeframe

A Timeframe defines the time window reference for a given Exploration’s event processing. Stream Analytics allows you to define two characteristics of the Timeframe:

  • Range – The universe of events that will be considered during Exploration processing, for instance by aggregate functions. In plain English, the range limits the events considered when calculating averages, max values or event counts (e.g. number of events of type A in the last 30 minutes). As there can be too many events, it’s essential to have some kind of boundary within which the analysis makes sense.
    • If a sensor reports that it is operating below a given threshold, it’s important to know whether this is a sporadic event that happens once a year or something that has been happening every 2 minutes for the last hour.
  • Eval. Frequency – The frequency at which events are passed on to the Exploration. Sometimes it’s important to collect the data from the Exploration inputs not every millisecond, but at longer intervals. This sets the cadence at which a given Exploration produces results (and thus pushes them to Targets).
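Range and Eval. Frequency are easier to grasp with numbers, so here is a rough Python sketch. It assumes a 30 minute Range evaluated every 10 minutes over a sensor that reports every 2 minutes; the function names and the sample data are made up, and the real engine works quite differently under the hood.

```python
from datetime import datetime, timedelta


def count_in_range(event_times, now, range_):
    """Count events that happened within `range_` before `now` (the Range)."""
    return sum(1 for t in event_times if now - range_ < t <= now)


def evaluate(event_times, start, end, range_, frequency):
    """Produce one result per evaluation tick (the Eval. Frequency)."""
    results = []
    tick = start
    while tick <= end:
        results.append((tick, count_in_range(event_times, tick, range_)))
        tick += frequency
    return results


start = datetime(2016, 1, 1, 9, 0)
# One below-threshold reading every 2 minutes for an hour: 09:00, 09:02, ..., 09:58.
readings = [start + timedelta(minutes=2 * i) for i in range(30)]
results = evaluate(
    readings,
    start=start + timedelta(minutes=30),
    end=start + timedelta(minutes=60),
    range_=timedelta(minutes=30),
    frequency=timedelta(minutes=10),
)
```

Each entry in results pairs an evaluation time with the number of readings seen in the 30 minutes before it, so a steady problem shows up as a consistently high count rather than a one-off.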

Although some of these concepts may seem confusing and unclear, as we go through the next articles and use them, they’ll become second nature.

So that’s a wrap on this article.

In our next article, we’ll start building our example application. Be prepared to have some fun doing it. Until then…

Maverick (José Rodrigues)

Process Timers

Process Timers – Controlling the time in which your process executes

Hello everybody,

Following up on a series of questions about setting timers in the Oracle Community forums, I decided to write this article to explain how timers work and how they can be used to control process execution.

Let’s start!

The Use Case

We’ll begin by setting up the scenario in which we’ll have to control our process flow.

Imagine that you want to have a part of your process that executes immediately if the current time is between 08:00am and 04:00pm (16:00 hours for us Europeans), or wait until 08:00am if it’s outside that interval.

It’s common to need this kind of control in parts of a process, for instance when you want to send SMS messages to your customers. You certainly don’t want to do it at 03:00am.

How will we make this?

We should use a Catch Timer event, of course, and XPATH’s DateTime functions to check the current time and to set the timer to wait for next morning’s 08:00.

The Catch Timer event can be configured in several ways: triggered at a specific date and time, on a repeatable schedule (every day at 10:28:00, for example), or in a time cycle (every 2 minutes). We’ll focus on the one where the timer waits for a specific time and date. More on the others perhaps in another article.

We’ll illustrate the use of timers with an example process. You can, of course, adapt it to your needs.

Defining the execution conditions 

So you start by defining a gateway that will split the execution between:

  • Immediate
  • Wait for 08:00am
    • This will have to be split into prior to midnight and after midnight, but for now we’ll consider the scenario of only two options.

So, you set the expression on the conditional flow that will do the immediate execution, leaving the condition that must wait for 08:00 as the unconditional (default) branch.

The expression should be something like this:

Timer Setting

xp20:hours-from-dateTime(xp20:current-dateTime()) >= 8 and xp20:hours-from-dateTime(xp20:current-dateTime()) < 16

The function xp20:current-dateTime() gets the current Date and Time of when the decision is evaluated.

The function xp20:hours-from-dateTime(xs:dateTime) gets the ‘Hours’ integer from a dateTime object.

So you check if the current time is after 08:00am and before 04:00pm.

  • If it is, it follows the Green Light path, i.e. the immediate execution path.
  • If not, it will follow the Red Light path, and will wait on the timer for a green light (until the next 08:00am, as per the requirements).
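If it helps to see the gateway decision outside of XPATH, here is the same check as a small Python sketch. It assumes the window closes at 16:00 sharp, so 16:30 already counts as outside; the function names are just illustrative.

```python
from datetime import datetime


def is_green_light(now: datetime) -> bool:
    """True when `now` falls in the immediate-execution window (08:00 up to 16:00)."""
    return 8 <= now.hour < 16


def choose_path(now: datetime) -> str:
    """Mimic the gateway: conditional flow first, default branch otherwise."""
    return "Green Light" if is_green_light(now) else "Red Light"
```

Calling choose_path with a 03:00am timestamp returns the Red Light path, which is exactly the SMS scenario we want to avoid.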

To better understand the test process, check the process flow below.

Timer Test Process

So, only one more step to go: setting the timer to the next available 08:00am.

This is achieved by first setting the Timer implementation to Type=Time Date (red arrow) and then setting the appropriate XPATH expression (orange arrow).

Timer Type Setting

The XPATH expression is as follows:

xp20:add-dayTimeDuration-to-dateTime(xp20:current-date(),'P01DT08H')

The add-dayTimeDuration-to-dateTime(xs:dateTime, duration) function adds a day/time interval to a dateTime object.

The interval is written as a duration string. The general duration format is ‘PyyYmmMddDThhHmmMssS’; here only the day and time parts are used, as in ‘P01DT08H’.

The xp20:current-date() function returns the current date without the time associated, meaning it considers time = 00:00:00.

So we’re adding 1 day and 8 hours to the current date (at 00:00:00), which gives 08:00am on the following day.
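The arithmetic is the same in any language. As a rough Python equivalent (not what the BPM engine runs, and with a made-up function name and sample date), midnight of the current date plus 1 day and 8 hours looks like this:

```python
from datetime import date, datetime, time, timedelta


def timer_target(today: date) -> datetime:
    """Midnight of `today` plus 1 day and 8 hours (the 'P01DT08H' interval)."""
    midnight = datetime.combine(today, time.min)  # date with time = 00:00:00
    return midnight + timedelta(days=1, hours=8)


target = timer_target(date(2016, 3, 14))
```

For 14 March the target is 08:00 on 15 March, whatever time of day the process reached the timer.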

Warning

You would think that this solves the issue, but not quite. It works if the process reaches the decision point before midnight. After midnight, the current date is already the day on which the timer should fire, so adding a whole day plus 08 hours would make the process wait a day too long.

So you should split your flow further to handle these two scenarios:

  • Wait occurs prior to midnight => XPATH Expression interval = ‘P01DT08H’
  • Wait occurs after midnight => XPATH Expression interval = ‘P00DT08H’

I’m pretty sure there are other ways to do it, but I decided to do it like this:

Timer Process Final

Here, I set the XPATH expression for the Time Date type of the other Timer (Red Light / After Midnight) to

xp20:add-dayTimeDuration-to-dateTime(xp20:current-date(),'P00DT08H')

So this should solve the case.
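To double-check the two Red Light branches, here is a small Python sketch that picks the interval depending on whether the wait starts before or after midnight. It is only a way to reason about the logic, assuming it is called only when the Green Light check has already failed; the function name is made up.

```python
from datetime import datetime, time, timedelta


def next_wait_target(now: datetime) -> datetime:
    """Return the 08:00 the Red Light path should wait for."""
    midnight = datetime.combine(now.date(), time.min)
    if now.hour < 8:
        # After midnight: 08:00 is still ahead today ('P00DT08H').
        return midnight + timedelta(hours=8)
    # Prior to midnight: wait for 08:00 tomorrow ('P01DT08H').
    return midnight + timedelta(days=1, hours=8)
```

A process arriving at 23:30 on 14 March and another arriving at 02:00 on 15 March both end up waiting for 08:00 on 15 March.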

I added the project file for this:

CommunityTimerDefinition

Cheers

Maverick (José Rodrigues)

Post Header image by Henrique Simplicio