
Unveiling EyesOnIt Version 2: Enhanced Scalability, Accuracy, and Programmability

EyesOnIt is very proud to announce the release of EyesOnIt Version 2. With Version 2, EyesOnIt has made significant improvements in the areas of scalability, accuracy, and programmability. The EyesOnIt Version 2 Docker image is now available on Docker Hub here.

Scalability Improvements

With Version 2, EyesOnIt has introduced a new processing pipeline optimized for efficiency and scalability, while preserving the power and flexibility of text-based object detection through the EyesOnIt Large Vision Model. As might be expected, the EyesOnIt Large Vision Model is… large. The model consumes significant CPU and GPU resources each time it compares object descriptions to an image or video frame. Fortunately, in most scenarios, running the Large Vision Model on every frame of video is not necessary. Giving system designers and installers more control over when the Large Vision Model runs lets them get more out of each server and reduce customer costs.

These improved costs and efficiencies translate to more customers for system integrators as lower price points make advanced video security available to new market segments. New customers, existing customers and system integrators will all be more satisfied with the resulting solutions through the elimination of unnecessary costs. With the EyesOnIt container-based architecture, system integrators can easily design systems to process as many streams as the customer needs. It just costs less now!

Frame Rate

One way EyesOnIt V2 increases efficiency and scalability is by offering control over the processing frame rate. With compressed video streams, every frame must be decoded, but not every frame must be processed. EyesOnIt V2 lets the user set the processing frame rate from one frame per second up to the frame rate of the source video. Frames that fall between processing intervals are decoded and then discarded, so they don’t consume unnecessary server resources. By controlling the frame rate of each stream, system integrators can reduce the cost of a new system or add new streams to an existing server without purchasing additional hardware.
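As a rough illustration of the idea, the sketch below chooses which decoded frames would be passed on for processing when the processing rate is lower than the source rate. This is not EyesOnIt code: the function name and parameters are hypothetical and exist only for this example.

```python
def select_frame_indices(source_fps, processing_fps, frame_count):
    """Return the indices of decoded frames to process (hypothetical helper).

    Every frame is assumed to be decoded already; frames whose index is not
    returned would simply be discarded.
    """
    if not 1 <= processing_fps <= source_fps:
        raise ValueError("processing_fps must be between 1 and source_fps")

    selected = []
    slots_filled = 0  # number of processing slots already assigned a frame
    for index in range(frame_count):
        # Frame `index` appears at time index / source_fps; the next slot is
        # due at time slots_filled / processing_fps. Compare without dividing.
        if index * processing_fps >= slots_filled * source_fps:
            selected.append(index)
            slots_filled += 1
    return selected


# Three seconds of 30 fps video processed at one frame per second.
print(select_frame_indices(30, 1, 90))  # [0, 30, 60]
```

Comparing products instead of dividing keeps the selection exact for integer rates, which avoids drift over long-running streams.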

Motion Detection

EyesOnIt V2 introduces visual motion detection to determine whether the content of the video stream has changed. This motion detection runs very quickly and consumes minimal server resources. By including motion detection in the EyesOnIt processing pipeline, EyesOnIt gives system integrators much more control over when the Large Vision Model gets executed. System integrators can customize where in their video frame to check for motion, how often to check for motion, and how much motion requires a response. They can even configure EyesOnIt to run the Large Vision Model to check the video periodically in the absence of motion.

With these settings, system integrators can ensure that their server uses minimal resources on objects of no interest, while still accurately detecting larger objects of real interest. When the Large Vision Model does run, the evaluation of object descriptions provides accurate detection of the objects and conditions the customer cares about.
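The gating described above can be sketched with simple frame differencing. This is a minimal illustration under stated assumptions, not the motion detector EyesOnIt ships: frames are plain lists of grayscale rows, and every name and threshold here (`motion_fraction`, `should_run_model`, `pixel_delta`, `max_idle_frames`) is hypothetical.

```python
def motion_fraction(previous, current, region, pixel_delta=25):
    """Fraction of pixels inside `region` whose brightness changed noticeably.

    `region` is (top, left, bottom, right) in pixel coordinates, with bottom
    and right exclusive. Frames are lists of rows of grayscale values 0-255.
    """
    top, left, bottom, right = region
    changed = 0
    total = 0
    for row in range(top, bottom):
        for col in range(left, right):
            total += 1
            if abs(current[row][col] - previous[row][col]) > pixel_delta:
                changed += 1
    return changed / total if total else 0.0


def should_run_model(previous, current, region, min_fraction,
                     frames_since_last_run, max_idle_frames):
    """Decide whether the expensive model should run on the current frame."""
    # Periodic check: run even without motion after a long enough quiet spell.
    if frames_since_last_run >= max_idle_frames:
        return True
    # Otherwise run only when enough of the watched region has changed.
    return motion_fraction(previous, current, region) >= min_fraction
```

Here `region` stands in for where to check for motion, `min_fraction` for how much motion requires a response, and `max_idle_frames` for the periodic check in the absence of motion.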

The images below show the motion detection settings added to EyesOnIt V2 and a visual depiction of a vehicle's motion across successive video frames.

Greater Detection Accuracy

Another important area of improvement in EyesOnIt V2 is object detection accuracy. This release gives users more intuitive control over the detection area and lets them give EyesOnIt additional guidance to improve detection accuracy. EyesOnIt V2 also offers a traditional bounding box model as an additional stage in the pipeline to focus the processing of the Large Vision Model.

Regions & Object Size

In EyesOnIt V2, the tiling approach from V1 has been replaced with a more intuitive method allowing users to designate the detection area accurately. In addition, the user can provide guidance regarding the expected size of the objects being detected. These new features, coupled with text-based object descriptions, give installers increased options to tell the EyesOnIt Large Vision Model what to detect.

Bounding Box Model

As an additional enhancement, EyesOnIt V2 adds a more traditional object detection model to the processing pipeline. This model is effective at detecting common objects such as people, vehicles, and bags. By enabling this optional step, installers add another level of detection that focuses the Large Vision Model more precisely on the area of interest. This improved focus leads to faster and more accurate processing by the Large Vision Model.
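One plausible way to picture this focusing step is to crop the frame around each detected box before handing it to the larger model. The sketch below assumes boxes are given as (left, top, right, bottom) pixel coordinates and that a small padding margin is useful; both the helper names and the padding default are hypothetical, and the post does not say how EyesOnIt implements this internally.

```python
def focus_region(box, frame_width, frame_height, padding=0.1):
    """Expand a detected box by a padding ratio and clamp it to the frame."""
    left, top, right, bottom = box
    pad_x = round((right - left) * padding)
    pad_y = round((bottom - top) * padding)
    return (
        max(0, left - pad_x),
        max(0, top - pad_y),
        min(frame_width, right + pad_x),
        min(frame_height, bottom + pad_y),
    )


def crop(frame, region):
    """Cut the region out of a frame stored as a list of pixel rows."""
    left, top, right, bottom = region
    return [row[left:right] for row in frame[top:bottom]]
```

Because the crop is usually much smaller than the full frame, the larger model would have fewer pixels to evaluate and less unrelated background to consider.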

The images below show the bounding box detection settings added to EyesOnIt V2, as well as a visual depiction of vehicle detection.

Enhanced Programmability

The final improvement in EyesOnIt V2 to mention here is the expansion of support for customization and integration through programmability. EyesOnIt V2 expands the existing REST API and adds SDKs for Python, C#, TypeScript and JavaScript. These developer resources allow programmers without detailed knowledge of computer vision to integrate the most advanced AI capabilities into their existing software and solutions.


The EyesOnIt REST API was introduced in V1 and has been expanded in V2. The REST API now supports all the features mentioned above. Programmers can specify options for frame rate, motion detection, detection regions, object size, and bounding box detection as they integrate EyesOnIt into their custom solutions. In addition to controlling EyesOnIt through REST calls, software that uses the REST API can receive notifications through REST calls as well. Programmers can now choose among REST, WebSockets, and SMS to receive alerts, which expands the ability to integrate EyesOnIt into new or existing software.
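To make the list of options concrete, the sketch below gathers them into a single settings object and serializes it as JSON, the way a client might prepare a request body. Every field name, default, and the example stream URL are hypothetical; the actual request schema is defined by the EyesOnIt REST API documentation, not by this example.

```python
import json
from dataclasses import asdict, dataclass, field

ALERT_METHODS = {"rest", "websocket", "sms"}  # the three mechanisms named above


@dataclass
class StreamSettings:
    """Hypothetical per-stream settings; field names are illustrative only."""
    stream_url: str
    object_descriptions: list
    processing_fps: int = 1
    motion_detection_enabled: bool = True
    min_motion_fraction: float = 0.02
    bounding_box_classes: list = field(default_factory=list)
    alert_method: str = "rest"

    def __post_init__(self):
        # Reject obviously invalid values before anything is sent.
        if self.processing_fps < 1:
            raise ValueError("processing_fps must be at least 1")
        if self.alert_method not in ALERT_METHODS:
            raise ValueError(f"alert_method must be one of {sorted(ALERT_METHODS)}")


def to_request_body(settings):
    """Serialize the settings to a JSON string suitable for a request body."""
    return json.dumps(asdict(settings), sort_keys=True)


settings = StreamSettings(
    stream_url="rtsp://example.invalid/stream",  # placeholder address
    object_descriptions=["delivery truck"],
    bounding_box_classes=["vehicle"],
)
print(to_request_body(settings))
```

Validating on the client side keeps malformed settings from reaching the server, whatever the real field names turn out to be.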


To further simplify the use of EyesOnIt for programmers using the most common languages, EyesOnIt V2 has introduced SDKs for Python, C#, TypeScript, and JavaScript. Programmers using these languages can use published packages from their familiar package managers and start using EyesOnIt in their favorite programming language with just a few lines of code. Each SDK includes documentation and sample code to simplify and expedite the path to successful object detection.


We hope you are as excited as we are about these big improvements to EyesOnIt. As always, please contact us if you have any questions that we can help with. To try EyesOnIt with your own images, check out our free demo.


