Sikuli's API and internal structure should be revised

Registered by RaiMan

Motivated through question

I think that, more than a year after the big step to the new API (9.x -> 10.x/X) and the implementation of many new features, it is time to revise and consolidate the API and the overall design.

Observations from the standpoint of a power user (based on one year of experience dealing with user questions):

-- the Jython and the Java layers are not synchronized
   - some methods exist on only one layer
   - many method signatures differ between the two layers
   - some features are implemented only on the Jython level (and so are not available on the Java level)

Possible enhancements, looking top-down:

1 -- features should be implemented only on the Java level, which should isolate all system dependencies (Java implementation level, the engine). The find(text)/OCR feature should be a separate engine jar.
2 -- the current Jython level should be reduced to an interface only, with no feature implementation (Jython API, maps to/uses the engine)
3 -- the Jython script run feature should be moved out of sikuli-script.jar into its own jar with a Jython and a Java API (Jython script runner); it should be configurable to use an external Jython
4 -- there should be an additional Java API layer that has the "same" classes/methods as the Jython layer and no feature implementation (Java API, maps to/uses the engine)
5 -- all support for running scripts should be concentrated in sikuli-ide.jar (which uses the "3 -- Jython script runner"). It should have a feature to generate jar packages (for all target systems) that can be run from the command line with only Java available.
6 -- the image capture/store feature and the preview feature should be available as a separate jar, so they can be used standalone or integrated into other IDEs/editors
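The split between the engine (item 1) and a pure API layer (item 4) could look roughly like the following sketch. All names here (Engine, RecordingEngine, ScreenApi) are hypothetical and are not part of the existing Sikuli code; the stand-in engine only records calls instead of touching the screen.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical engine contract (item 1): the only place where features are implemented.
interface Engine {
    // A real engine would return a match object; a String keeps the sketch small.
    String find(String imagePath);
    void click(String imagePath);
}

// Stand-in engine that records calls instead of touching the screen.
class RecordingEngine implements Engine {
    final List<String> calls = new ArrayList<>();

    @Override
    public String find(String imagePath) {
        calls.add("find:" + imagePath);
        return "match(" + imagePath + ")";
    }

    @Override
    public void click(String imagePath) {
        calls.add("click:" + imagePath);
    }
}

// Hypothetical Java API layer (item 4): no feature logic, only delegation to the engine.
class ScreenApi {
    private final Engine engine;

    ScreenApi(Engine engine) {
        this.engine = engine;
    }

    String find(String imagePath) {
        return engine.find(imagePath);
    }

    void click(String imagePath) {
        engine.click(imagePath);
    }
}
```

Because ScreenApi only sees the Engine interface, a different engine (for example one that works over VNC) could be plugged in without touching the API layer, and a Jython layer could wrap the same interface.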

With this concept, we would have a true engine and a true API, so each could be exchanged or enhanced independently of the other (e.g. the VNC challenge or Selenium integration). And we would have some tools (IDE, script runner, capture support, preview) that could be used or not.

The distribution package should be one jar that contains everything plus a self-running installer that does everything needed on the respective system (like the Jython distribution).

There should also be separate packages available (or configurable from the full download):
- basic Jython (1, 2, 3, 6)
- basic Java (1, 4, 6)
- basic complete (1, 2, 3, 4, 6)
- IDE basic (1, 2, 3, 5)
- complete (1 - 6)
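As a rough illustration, the package list above can be written down as data, which is one way a configurable build could decide which jars go into which download. The enum names are invented for this sketch; only the numbering 1 - 6 and the five combinations come from the lists above.

```java
import java.util.EnumSet;
import java.util.Set;

// The six jars from the numbered list, in order 1 - 6 (names are illustrative only).
enum Component { ENGINE, JYTHON_API, JYTHON_RUNNER, JAVA_API, IDE, CAPTURE_PREVIEW }

// The five proposed packages, each defined by the jars it bundles.
enum Distribution {
    BASIC_JYTHON(EnumSet.of(Component.ENGINE, Component.JYTHON_API,
            Component.JYTHON_RUNNER, Component.CAPTURE_PREVIEW)),
    BASIC_JAVA(EnumSet.of(Component.ENGINE, Component.JAVA_API,
            Component.CAPTURE_PREVIEW)),
    BASIC_COMPLETE(EnumSet.of(Component.ENGINE, Component.JYTHON_API,
            Component.JYTHON_RUNNER, Component.JAVA_API, Component.CAPTURE_PREVIEW)),
    IDE_BASIC(EnumSet.of(Component.ENGINE, Component.JYTHON_API,
            Component.JYTHON_RUNNER, Component.IDE)),
    COMPLETE(EnumSet.allOf(Component.class));

    private final Set<Component> components;

    Distribution(Set<Component> components) {
        this.components = components;
    }

    Set<Component> components() {
        return components;
    }
}
```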

For each of these jars (1 - 6) there should be 3 levels of development (stable, bug fix/feature add, experimental) and separate build workflows.

The distribution packages should have a configurable build workflow to integrate private versions of jars 2, 4 and 6 together with the other components needed.

Based on all this, it would be easy to implement JRuby scripting (exchange jars 2 and 3), a Scala API (exchange jar 4) or any other API for environments that can talk directly to Java. Even true integration into environments that are not directly Java-aware (Python, Perl, Ruby, C, ...) would be fairly easy (the API level has to be implemented for the respective language and "connected" to the engine's Java API).

One could implement an optional API version (even configurable) that fully supports before/after callbacks (generic and specific) for the relevant features.
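A minimal sketch of what generic before/after callbacks might look like on the API level. ActionHook and CallbackApi are hypothetical names, not existing Sikuli classes; the action itself is passed in as a Supplier so the sketch stays independent of any engine.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical callback contract: invoked around every API action.
interface ActionHook {
    void before(String action, String target);
    void after(String action, String target, Object result);
}

// Hypothetical API wrapper that adds generic before/after hooks to any action.
class CallbackApi {
    private final List<ActionHook> hooks = new ArrayList<>();

    void addHook(ActionHook hook) {
        hooks.add(hook);
    }

    // Runs the action between the before and after callbacks and returns its result.
    <T> T invoke(String action, String target, Supplier<T> body) {
        for (ActionHook hook : hooks) {
            hook.before(action, target);
        }
        T result = body.get();
        for (ActionHook hook : hooks) {
            hook.after(action, target, result);
        }
        return result;
    }
}
```

Specific callbacks (for example only for click) could be built on the same hook by checking the action name inside before/after.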

It would also be easy to implement support for changing/complementing the API (statically, as today with extensions, or even dynamically on the fly at run time).
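One possible shape for changing the API on the fly is a registry of named commands that can be added, replaced or removed at run time. This is only a sketch under that assumption; ExtensibleApi and its methods are hypothetical and do not describe how Sikuli extensions work today.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical API surface whose commands can be changed while a script is running.
class ExtensibleApi {
    private final Map<String, Function<String, Object>> commands = new ConcurrentHashMap<>();

    // Adds a command or replaces an existing one with the same name.
    void register(String name, Function<String, Object> implementation) {
        commands.put(name, implementation);
    }

    // Removes a command; returns true if it was registered.
    boolean unregister(String name) {
        return commands.remove(name) != null;
    }

    // Looks up the command by name and applies it to the argument.
    Object call(String name, String argument) {
        Function<String, Object> implementation = commands.get(name);
        if (implementation == null) {
            throw new IllegalArgumentException("Unknown command: " + name);
        }
        return implementation.apply(argument);
    }
}
```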

Based on jar 6, there would be more motivation to implement Sikuli support in other IDEs or editors.

Blueprint information

Needs approval
Series goal:
Good progress
Milestone target: 1.0.0

New Design of Sikuli X

package: - computer vision code
org.sikuli.common - common utilities
org.sikuli.script.internal - internal implementation of sikuli script
org.sikuli.script - public API of sikuli script
org.sikuli.test - public API of sikuli test

org.sikuli.ide.common - common utilities of sikuli ide
org.sikuli.ide - sikuli ide

1. Internal Java Implementation - org.sikuli.script.internal
2. Public Java API - org.sikuli.script.{Region, Screen, App, ...}
    - Other languages on JVM (JRuby, Scala, Groovy, ...) can call this API directly.
3. Public Jython API - sikuli.{Sikuli, Region, Screen, ...}
    - thin layer; only handles Jython-specific features, e.g. with, vdict.

sikuli-script-jython.jar (same as the old sikuli-script.jar)
1. everything in sikuli-script-core.jar
2. jython runtime and libs

1. useful functions of the Sikuli IDE, including screen capture, match testing and bundle management.

1. Sikuli IDE implementation
2. depends on sikuli-ide-lib.jar

