Stroke boundary detection

Mark Torrance edited this page Apr 11, 2016 · 1 revision

If you could point me to the part of the code where you do the stroke separation, I could have a look at it and maybe try to implement it.

The method that detects stroke boundaries is in the OpenHandWrite/src/markwrite/markwrite/project.py file, starting on line 618:

def findstrokes(self, searchsamplearray, obsolute_offset, parent_id):

This method is called once for each sample Series or Run detected in the data file when it is loaded. The first Stroke Detection app setting specifies whether strokes are detected by Series or Runs.

The method currently:

  1. Gets the xy velocity local minima for the samples within the searchsamplearray array of samples (which will be all the samples in one Series or Run of the data file).

  2. XY velocity local minima are found using a third-party function called detect_peaks, which is defined in the OpenHandWrite/src/markwrite/markwrite/sigproc/detect_peaks.py file. detect_peaks is supposed to reproduce the algorithm used by MATLAB's findpeaks function.

  3. The sample positions returned by detect_peaks are used to create an array of stroke boundaries, which include the start and end index and time for each stroke detected.

  4. The first and last stroke boundary points (StBP) in the array are moved so that they coincide with the first and last sample of the Run or Series that was passed to the findstrokes method.
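The steps above can be sketched roughly as follows. This is not the MarkWrite code: `local_minima` is a hypothetical stand-in for detect_peaks that uses a plain neighbour comparison, `find_stroke_bounds` is an illustrative name, and the handling of step 4 (overwriting the first and last boundary with the ends of the sample array) is one reading of the description.

```python
def local_minima(values):
    """Indices whose value is strictly lower than both neighbours.

    Hypothetical stand-in for detect_peaks; the real function has
    more options and may handle flat regions differently.
    """
    return [i for i in range(1, len(values) - 1)
            if values[i] < values[i - 1] and values[i] < values[i + 1]]


def find_stroke_bounds(times, velocity):
    """Return (start_index, end_index, start_time, end_time) per stroke.

    `times` and `velocity` hold one value per sample of a single
    Series or Run (assumed to be equally long and non-empty).
    """
    last = len(velocity) - 1
    points = local_minima(velocity)
    if len(points) < 2:
        # Assumption: too few minima means the whole array is one stroke.
        points = [0, last]
    else:
        # Step 4: move the outer boundaries to the first and last sample.
        points[0] = 0
        points[-1] = last
    # Consecutive boundary points delimit one stroke each.
    return [(s, e, times[s], times[e]) for s, e in zip(points, points[1:])]


times = [i * 10 for i in range(11)]
velocity = [5, 3, 1, 4, 6, 2, 5, 7, 3, 6, 4]
strokes = find_stroke_bounds(times, velocity)
```

Here the minima fall at indices 2, 5 and 8, so after the outer boundaries are moved the array is split into two strokes that share the boundary at index 5.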

If working on this code, these files may also be of interest:

  1. Pen Sample Filtering: Filtering of sample data is done once for each sample Series detected in the data, using a function filter_pen_sample_series(series). This function is located in the OpenHandWrite/src/markwrite/markwrite/sigproc/sample_filter.py file.

  2. Velocity / Acceleration calculation: Velocity and acceleration measures are calculated for each sample Series detected in the data, using a function calculate_velocity(series). This function is located in the OpenHandWrite/src/markwrite/markwrite/sigproc/sample_va.py file.
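For orientation, an xy velocity measure of the kind the stroke detector consumes could be computed along these lines. This is only a sketch under assumptions: `xy_speed` is a hypothetical helper, the central-difference scheme is chosen for illustration, and the actual calculate_velocity(series) function may filter or differentiate the data differently.

```python
import math


def xy_speed(times, xs, ys):
    """Speed per sample from a central difference over neighbouring samples.

    At the first and last sample the difference is one-sided.
    All three lists are assumed to have the same length.
    """
    n = len(times)
    speeds = []
    for i in range(n):
        a = max(i - 1, 0)        # previous sample, clamped at the start
        b = min(i + 1, n - 1)    # next sample, clamped at the end
        dt = times[b] - times[a]
        dist = math.hypot(xs[b] - xs[a], ys[b] - ys[a])
        # Guard against a zero time step (e.g. a single-sample series).
        speeds.append(dist / dt if dt > 0 else 0.0)
    return speeds


speeds = xy_speed([0, 1, 2, 3], [0, 3, 6, 9], [0, 4, 8, 12])
```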

I noticed that stroke separation for getwrite files is much better. I tried both versions of MarkWrite (within ohw and standalone) and found no differences.

You mean that the stroke segmentation for hdf5 files is better than for txyp files? If so, I would check whether MarkWrite is sensibly splitting your data into Series. If there are far more Series than there should be, this could have a negative impact on stroke detection. If this is the case, change the max ISI app setting to a value higher than the default, which is calculated from the ISI distribution of the pen data being loaded:

series_isi_thresh = np.percentile(sample_dts,99.0,interpolation='nearest')*2.5

For reference, the txyp file Series splitting is done in the OpenHandWrite/src/markwrite/markwrite/file_io.py file by the TabDelimitedDataImporter.postprocess() method; specifically on line 254 of the current code.
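To see how the max ISI value affects the number of Series, here is a small standalone sketch. It assumes that ISI means the interval between consecutive sample times and that a new Series starts wherever that interval exceeds the threshold. `nearest_percentile` and `split_into_series` are hypothetical names, and the percentile helper only approximates np.percentile with nearest interpolation (newer NumPy versions name that argument `method`).

```python
def nearest_percentile(values, q):
    """Sorted value closest to the position q percent of the way through."""
    ordered = sorted(values)
    pos = q / 100.0 * (len(ordered) - 1)
    return ordered[int(round(pos))]


def split_into_series(times, max_isi=None, percentile=99.0, scale=2.5):
    """Split sample times into Series wherever the gap exceeds max_isi.

    Returns (max_isi, [(first_index, last_index), ...]) with inclusive
    indices. If max_isi is None, the default is derived from the data
    in the same spirit as the series_isi_thresh expression.
    """
    if len(times) < 2:
        return max_isi, [(0, len(times) - 1)] if times else []
    dts = [t1 - t0 for t0, t1 in zip(times, times[1:])]
    if max_isi is None:
        max_isi = nearest_percentile(dts, percentile) * scale
    series = []
    start = 0
    for i, dt in enumerate(dts):
        if dt > max_isi:
            # The gap between sample i and i + 1 is too long: close the Series.
            series.append((start, i))
            start = i + 1
    series.append((start, len(times) - 1))
    return max_isi, series


# Two bursts of samples 10 ms apart, separated by a pause of about one second.
times = list(range(0, 1000, 10)) + list(range(2000, 3000, 10))
default_thresh, default_series = split_into_series(times)
_, too_low = split_into_series(times, max_isi=5)
_, raised = split_into_series(times, max_isi=2000)
```

With the data-derived default the pause splits the data into two Series. A threshold below the sampling interval puts every sample in its own Series, and raising it above the pause merges everything into one.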
