As covered in previous posts, the mission of EyesOnIt is to bring the power of Large Vision Models to users who may not have the data science or computer programming skills to work with Artificial Intelligence at a deep level. The EyesOnIt Docker image includes a web UI that lets users apply our Large Vision Model to images and video using nothing but text prompts. The goal of this web UI is to provide a starting point so that all users can see results quickly. To keep it simple and intuitive, the web UI is intentionally limited, and it lacks support for customizations that many users need.
To support more advanced customization, EyesOnIt offers a REST API that allows even programmers with limited experience to tailor EyesOnIt to their own use cases. In fact, the web UI mentioned above uses the EyesOnIt REST API to configure and control EyesOnIt. Through the REST API, programmers can develop extensive customizations and integrations that can:
Dynamically update object descriptions based on conditions such as time, weather, lighting, or location (for mobile cameras)
Programmatically update object descriptions in response to sensors including motion, proximity, access control or other sensors
Integrate EyesOnIt with Video Management Systems, databases, alerting systems, retail checkout or other systems
Implement custom logic for multi-layered detections, complex multi-sensor fusion, or alerting conditions with several criteria
The possibilities for what can be accomplished with EyesOnIt through custom code are quite extensive!
The REST API
The EyesOnIt REST API provides access to all the capabilities of the EyesOnIt Large Vision Model. The REST API also supports use of features layered on top of the Large Vision Model, including tile masking and SMS alerting. Since REST calls are supported by most programming languages, software developers can use the language of their choice to customize their use of EyesOnIt.
The endpoints supported by the REST API are outlined below, including the path, REST method, inputs, and outputs for each method call.
add_stream Method
POST /add_stream – add a stream that you would like to monitor to the EyesOnIt stream list
Inputs:
The RTSP URL of the stream to add
A friendly name for the stream
The frame rate for processing (default = 5)
The text descriptions of objects to look for in each frame and the alerting threshold for each object description
The tiling configuration, including the number of rows and columns, and which tiles should be processed
The alerting configuration, including the positive alert time, negative reset time, phone number and image alerting options
Outputs:
Success / failure (with an error message if unsuccessful)
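To make the add_stream inputs concrete, here is a sketch of what a request might look like in Python using only the standard library. The JSON field names below ("rtsp_url", "object_descriptions", and so on) are illustrative assumptions, not the actual schema; consult the /docs page described later in this post for the real field names.

```python
import json
import urllib.request

# Hypothetical payload for POST /add_stream. All field names are
# assumptions for illustration; see http://localhost:8000/docs for
# the actual JSON schema.
payload = {
    "rtsp_url": "rtsp://192.168.1.10:554/stream1",
    "name": "Front Gate",
    "frame_rate": 5,  # frames per second to process (default = 5)
    "object_descriptions": [
        {"description": "a person climbing a fence", "threshold": 0.8},
    ],
    "tiling": {"rows": 2, "columns": 2, "tiles": [0, 1, 2, 3]},
    "alerting": {
        "positive_alert_time": 5,
        "negative_reset_time": 30,
        "phone_number": "+15555550100",
        "send_image": True,
    },
}

request = urllib.request.Request(
    "http://localhost:8000/add_stream",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)

# With a running EyesOnIt container, send the request with:
# with urllib.request.urlopen(request) as response:
#     result = json.loads(response.read())
```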
monitor_stream Method
POST /monitor_stream – start monitoring a specified stream
Inputs:
The RTSP URL of the stream to monitor
The duration to monitor the stream (default = infinite)
Outputs:
Success / failure (with an error message if unsuccessful)
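A monitor_stream call might look like the sketch below. Again, the field names are assumptions, and the exact encoding of an infinite duration is a guess; check the /docs page for the real schema.

```python
import json
import urllib.request

# Hypothetical POST /monitor_stream request. Field names are assumptions;
# the convention that duration 0 means "monitor indefinitely" is also an
# assumption. See /docs for the actual schema.
payload = {
    "rtsp_url": "rtsp://192.168.1.10:554/stream1",
    "duration": 0,
}
request = urllib.request.Request(
    "http://localhost:8000/monitor_stream",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)
# With a running container: urllib.request.urlopen(request)
```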
stop_monitoring Method
POST /stop_monitoring – stop monitoring a specified stream
Inputs:
The RTSP URL of the stream to stop monitoring
Outputs:
Success / failure (with an error message if unsuccessful)
remove_stream Method
POST /remove_stream – remove a specified stream from the EyesOnIt stream list
Inputs:
The RTSP URL of the stream to remove
Outputs:
Success / failure (with an error message if unsuccessful)
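Since stop_monitoring and remove_stream both take only an RTSP URL, a small helper can build either request. The "rtsp_url" field name is an assumption; see the /docs page for the actual schema.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # adjust to match your Docker setup


def build_stream_request(endpoint: str, rtsp_url: str) -> urllib.request.Request:
    """Build a POST request for endpoints that take only an RTSP URL.

    The "rtsp_url" field name is an assumption; see /docs for the schema.
    """
    payload = {"rtsp_url": rtsp_url}
    return urllib.request.Request(
        f"{BASE_URL}/{endpoint}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


stop_req = build_stream_request("stop_monitoring", "rtsp://192.168.1.10:554/stream1")
remove_req = build_stream_request("remove_stream", "rtsp://192.168.1.10:554/stream1")
# With a running container: urllib.request.urlopen(stop_req), etc.
```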
get_video_frame Method
POST /get_video_frame – get the latest video frame from a stream that is being monitored
Inputs:
The RTSP URL of the stream from which to get a frame
Outputs:
Success: the video frame is returned as a base64 encoded image
Failure: an error message is returned
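Because the frame comes back base64 encoded, the caller needs to decode it before saving or displaying it. The sketch below uses a fake JPEG and assumed response field names ("success", "image"); consult the /docs page for the actual response schema.

```python
import base64

# Hypothetical /get_video_frame response handling. The "success" and
# "image" field names are assumptions; a fake JPEG stands in for a real
# response here. See /docs for the actual schema.
fake_jpeg = b"\xff\xd8\xff\xe0" + b"fake-jpeg-data"
response_body = {"success": True, "image": base64.b64encode(fake_jpeg).decode("ascii")}

if response_body["success"]:
    frame_bytes = base64.b64decode(response_body["image"])
    # frame_bytes can now be written to disk or handed to an image library,
    # e.g. open("frame.jpg", "wb").write(frame_bytes)
```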
get_preview_video_frame Method
POST /get_preview_video_frame – get a video frame from a stream that is not in the EyesOnIt stream list
Inputs:
The RTSP URL of the stream from which to get a frame
Outputs:
Success: the video frame is returned as a base64 encoded image
Failure: an error message is returned
get_last_detection_info Method
POST /get_last_detection_info – get data, including the video frame, for the last object detection
Inputs:
The RTSP URL of the stream for which to get the last detection information
Outputs:
Success: data about the last detection including:
The video frame when the detection occurred
The object description that triggered the detection
The confidence levels of all object descriptions when the detection occurred
The time of the detection
Failure: an error message is returned
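The detection data lends itself to simple post-processing, such as finding the highest-confidence description. The response below is entirely hypothetical, with assumed field names; see the /docs page for the real structure.

```python
import json

# A hypothetical /get_last_detection_info response. All field names and
# values are illustrative assumptions; consult /docs for the real schema.
response_text = json.dumps({
    "success": True,
    "image": "<base64-encoded frame>",
    "triggered_description": "a person climbing a fence",
    "confidences": {"a person climbing a fence": 0.91, "an empty fence": 0.09},
    "detection_time": "2024-05-01T14:32:07Z",
})

info = json.loads(response_text)
# Find the object description with the highest confidence.
best = max(info["confidences"], key=info["confidences"].get)
```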
get_streams_info Method
GET /get_streams_info – get data about the streams in the EyesOnIt stream list
Inputs:
None – this method returns data for all streams in the EyesOnIt stream list
Outputs:
A JSON structure including one set of data for each known stream. The data for each stream includes:
The RTSP URL of the stream
The friendly name of the stream
The frame rate for processing
The object descriptions with the alerting threshold for each
The tiling configuration, including the number of rows and columns, and which tiles will be processed
The alerting configuration, including the positive alert time, negative reset time, phone number and image alerting options
Data about the last detection, including the object description that triggered the detection, the confidence levels of all object descriptions, and the time of the detection
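Because get_streams_info is a GET with no inputs, calling it and iterating the results is straightforward. The response shape below is an assumption for illustration; the /docs page has the actual structure.

```python
import json
import urllib.request

# With a running container, fetch the stream list with:
# with urllib.request.urlopen("http://localhost:8000/get_streams_info") as r:
#     streams = json.loads(r.read())
#
# A hypothetical response, with assumed field names (see /docs):
streams = [
    {
        "rtsp_url": "rtsp://192.168.1.10:554/stream1",
        "name": "Front Gate",
        "frame_rate": 5,
    },
]

for stream in streams:
    print(f'{stream["name"]}: {stream["rtsp_url"]} at {stream["frame_rate"]} fps')
```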
process_image Method
POST /process_image – process a single image with EyesOnIt
Inputs:
The text descriptions of objects to look for in the image
The tiling configuration, including the number of rows and columns, and which tiles should be processed
The image to process, specified as a base64-encoded image string
Outputs:
A collection of object descriptions with their associated confidence levels
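The main wrinkle for process_image is encoding the image as base64 before building the payload. As with the other examples, the field names are assumptions; the /docs page has the actual schema.

```python
import base64
import json

# Hypothetical POST /process_image payload. Field names are assumptions;
# see /docs for the actual schema. Fake bytes stand in for a real image,
# which you would normally read with open("photo.png", "rb").read().
image_bytes = b"\x89PNG\r\n\x1a\n" + b"fake-image-data"
payload = {
    "image": base64.b64encode(image_bytes).decode("ascii"),
    "object_descriptions": ["a delivery truck", "an empty driveway"],
    "tiling": {"rows": 1, "columns": 1, "tiles": [0]},
}
body = json.dumps(payload).encode("utf-8")
# With a running container:
# urllib.request.urlopen(urllib.request.Request(
#     "http://localhost:8000/process_image", data=body,
#     headers={"Content-Type": "application/json"}, method="POST"))
```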
To get the full details for each of the above methods including the JSON format of inputs and outputs, run the EyesOnIt Docker image according to the instructions in the User Guide, and navigate to http://localhost:8000/docs in a browser (IP address and port may be different depending on how you ran the Docker image). The /docs page will provide all the details you need to start using the REST API.
Language-Specific Libraries
To simplify the use of the REST API, the EyesOnIt team is creating open-source language-specific libraries with methods that mirror the REST API methods and data structures that match the inputs and outputs for each method. The team is nearly done with the libraries for JavaScript, TypeScript, and C#. A Python library will be coming soon. Future blog posts will provide more details about how to use these libraries.
Conclusion
Large Vision Models are the future of computer vision. EyesOnIt makes Large Vision Models available to all users to process images and videos. While the EyesOnIt web UI provides basic capabilities, the EyesOnIt REST API allows programmers to customize and integrate EyesOnIt in extremely powerful ways. Please contact us and let us know how you are using the EyesOnIt REST API.