textInfos package
Framework for accessing text content in widgets. The core component of this framework is the L{TextInfo} class. In order to access text content for a widget, a L{TextInfo} implementation is required. A default implementation, L{NVDAObjects.NVDAObjectTextInfo}, is used to enable text review of information about a widget which does not have or support text content.
- class textInfos.Field
Bases:
dict
Provides information about a piece of text.
- class textInfos.FormatField
Bases:
Field
Provides information about the formatting of text; e.g. font information and hyperlinks.
- class textInfos.ControlField
Bases:
Field
Provides information about a control which encompasses text. For example, a piece of text might be contained within a table, button, form, etc. This field contains information about such a control, such as its role, name and description.
- PRESCAT_SINGLELINE = 'singleLine'
This field is usually a single line item; e.g. a link or heading.
- PRESCAT_MARKER = 'marker'
This field is a marker; e.g. a separator or footnote.
- PRESCAT_CONTAINER = 'container'
This field is a container, usually multi-line.
- PRESCAT_CELL = 'cell'
This field is a section of a larger container which is adjacent to another similar section; e.g. a table cell.
- PRESCAT_LAYOUT = None
This field is just for layout.
- getPresentationCategory(ancestors, formatConfig, reason=OutputReason.CARET, extraDetail=False)
- class textInfos.FieldCommand(command: str, field: ControlField | FormatField | None)
Bases:
object
A command indicating a L{Field} in a sequence of text and fields. When retrieving text with its associated fields, a L{TextInfo} provides a sequence of text strings and L{FieldCommand}s. A command indicates the start or end of a control or that the formatting of the text has changed.
Constructor. @param command: The command; one of:
“controlStart”, indicating the start of a L{ControlField}; “controlEnd”, indicating the end of a L{ControlField}; or “formatChange”, indicating a L{FormatField} change.
@param field: The field associated with this command; may be C{None} for controlEnd.
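As a rough illustration of the sequence described above, the following sketch walks a list of text strings and field commands. `ToyFieldCommand` and `describe` are hypothetical stand-ins written for this example; they only mirror the documented `command` and `field` parameters and are not part of the package.

```python
from dataclasses import dataclass
from typing import List, Optional, Union


@dataclass
class ToyFieldCommand:
    """Hypothetical stand-in for textInfos.FieldCommand (command plus field)."""
    command: str            # "controlStart", "controlEnd" or "formatChange"
    field: Optional[dict]   # may be None for "controlEnd"


def describe(sequence: List[Union[str, ToyFieldCommand]]) -> List[str]:
    """Flatten a text-with-fields sequence into indented, readable lines."""
    lines = []
    depth = 0
    for item in sequence:
        if isinstance(item, str):
            lines.append("  " * depth + "text: " + repr(item))
        elif item.command == "controlStart":
            lines.append("  " * depth + "start " + str(item.field.get("role")))
            depth += 1
        elif item.command == "controlEnd":
            depth -= 1
            lines.append("  " * depth + "end")
        elif item.command == "formatChange":
            lines.append("  " * depth + "format " + str(sorted(item.field.items())))
    return lines


sample = [
    ToyFieldCommand("controlStart", {"role": "link"}),
    ToyFieldCommand("formatChange", {"bold": True}),
    "Home",
    ToyFieldCommand("controlEnd", None),
]
result = describe(sample)
```

Here a control start opens a nesting level, a format change applies to the text that follows, and the control end closes the level again.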
- class textInfos.Bookmark(*args, **kwargs)
Bases:
AutoPropertyObject
Represents a static absolute position in some text. This is used to construct a L{TextInfo} at an exact previously obtained position.
@param infoClass: The class of the L{TextInfo} object. @type infoClass: type; subclass of L{TextInfo} @param data: Data that can be used to reconstruct the position the textInfo object was in when it generated the bookmark.
- infoClass
The class of the L{TextInfo} object. @type: type; subclass of L{TextInfo}
- data
Data that can be used to reconstruct the position the textInfo object was in when it generated the bookmark.
- _propertyCache: Set[Callable[[AutoPropertyObject], Any]]
- textInfos._logBadSequenceTypes(sequence: List[Any | str], shouldRaise: bool = True)
- class textInfos.TextInfo(*args, **kwargs)
Bases:
AutoPropertyObject
Provides information about a range of text in an object and facilitates access to all text in the widget. A TextInfo represents a specific range of text, providing access to the text itself, as well as information about the text such as its formatting and any associated controls. This range can be moved within the object’s text relative to the initial position.
- At a minimum, subclasses must:
Extend the constructor so that it can set up the range at the specified position.
Implement the L{move}, L{expand}, L{compareEndPoints}, L{setEndPoint} and L{copy} methods.
Implement the L{text} and L{bookmark} attributes.
Support at least the L{UNIT_CHARACTER}, L{UNIT_WORD} and L{UNIT_LINE} units.
Support at least the L{POSITION_FIRST}, L{POSITION_LAST} and L{POSITION_ALL} positions.
If an implementation should support tracking with the mouse, L{Points} must be supported as a position. To support routing to a screen point from a given position, L{pointAtStart} or L{boundingRects} must be implemented. In order to support text formatting or control information, L{getTextWithFields} should be overridden.
@ivar bookmark: A unique identifier that can be used to make another textInfo object at this position. @type bookmark: L{Bookmark}
Constructor. Subclasses must extend this, calling the superclass method first. @param position: The initial position of this range; one of the POSITION_* constants or a position object supported by the implementation. @param obj: The object containing the range of text being represented.
- basePosition
The position with which this instance was constructed.
- property start: TextInfoEndpoint
Typing information for auto-property: start
- _get_start() TextInfoEndpoint
- _set_start(otherEndpoint: TextInfoEndpoint)
- property end: TextInfoEndpoint
Typing information for auto-property: end
- _get_end() TextInfoEndpoint
- _set_end(otherEndpoint: TextInfoEndpoint)
- obj: documentBase.TextContainerObject
Typing information for auto-property: _get_obj
- _get_obj() documentBase.TextContainerObject
The object containing the range of text being represented.
- _get_unit_mouseChunk()
- text: str
Typing information for auto-property: _get_text
- _abstract_text = True
- _get_text() str
The text within this range. Subclasses must implement this. @return: The text. @note: The text is not guaranteed to be the exact length of the range in offsets.
- TextOrFieldsT
alias of Union[str, FieldCommand]
- TextWithFieldsT
alias of List[Union[str, FieldCommand]]
- getTextWithFields(formatConfig: Dict | None = None) List[str | FieldCommand]
Retrieves the text in this range, as well as any control/format fields associated therewith. Subclasses may override this. The base implementation just returns the text. @param formatConfig: Document formatting configuration, useful if you wish to force a particular configuration for a particular task. @return: A sequence of text strings interspersed with associated field commands.
- _get_locationText()
A message that explains the location of the text position in friendly terms.
- _get_boundingRects()
Per line bounding rectangles for the visible text in this range. Implementations should ensure that the bounding rectangles don’t contain off screen coordinates. @rtype: [L{locationHelper.RectLTWH}] @raise NotImplementedError: If not supported. @raise LookupError: If not available (i.e. off screen, hidden, etc.)
- unitIndex(unit: str) int
@param unit: a unit constant for which you want to retrieve an index @returns: The 1-based index of this unit, out of all the units of this type in the object
- unitCount(unit)
@param unit: a unit constant @type unit: string @returns: the number of units of this type in the object @rtype: int
- abstract compareEndPoints(other, which)
Compares one end of this range to one end of another range. Subclasses must implement this. @param other: the text range to compare with. @type other: L{TextInfo} @param which: The ends to compare; one of “startToStart”, “startToEnd”, “endToStart”, “endToEnd”. @return: -1 if this end is before the other end, 1 if this end is after the other end, or 0 if both ends are the same. @rtype: int
- isOverlapping(other)
Determines whether this object overlaps another object in any way. Note that collapsed objects can cause some confusion. For example, in terms of offsets, (4, 4) and (4, 5) are not considered as overlapping. Therefore, collapsed objects should probably be expanded to at least 1 character when using this method. @param other: The TextInfo object being compared. @type other: L{TextInfo} @return: C{True} if the objects overlap, C{False} if not. @rtype: bool
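The collapsed-range caveat above can be reproduced with plain offsets. This is a minimal sketch, assuming ranges are `(start, end)` tuples with an exclusive end; `compare_end_points` and `is_overlapping` are hypothetical helpers written for this example, and the overlap test shown is one formulation consistent with the documented behaviour, not necessarily the one the package uses.

```python
from typing import Tuple

Range = Tuple[int, int]  # (start, end) offsets, end exclusive

# Which tuple index each side of a "which" string refers to.
_ENDS = {
    "startToStart": (0, 0),
    "startToEnd": (0, 1),
    "endToStart": (1, 0),
    "endToEnd": (1, 1),
}


def compare_end_points(a: Range, b: Range, which: str) -> int:
    """Return -1, 0 or 1 comparing one end of a with one end of b."""
    mine, theirs = _ENDS[which]
    x, y = a[mine], b[theirs]
    return (x > y) - (x < y)


def is_overlapping(a: Range, b: Range) -> bool:
    """True if a starts before b ends and a ends after b starts."""
    return (
        compare_end_points(a, b, "startToEnd") < 0
        and compare_end_points(a, b, "endToStart") > 0
    )


collapsed_vs_expanded = is_overlapping((4, 4), (4, 5))
expanded_vs_expanded = is_overlapping((4, 5), (4, 5))
adjacent = is_overlapping((0, 4), (4, 5))
```

With this formulation the collapsed range (4, 4) does not overlap (4, 5), while expanding it to (4, 5) does, which matches the advice to expand collapsed ranges first.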
- abstract setEndPoint(other, which)
Sets one end of this range to one end of another range. Subclasses must implement this. @param other: The range from which an end is being obtained. @type other: L{TextInfo} @param which: The ends to use; one of “startToStart”, “startToEnd”, “endToStart”, “endToEnd”.
- _get_isCollapsed()
@return: C{True} if representing a collapsed range, C{False} if the range is expanded to cover one or more characters. @rtype: bool
- abstract expand(unit)
Expands the start and end of this text info object to a given unit @param unit: a unit constant @type unit: string
- collapse(end=False)
Collapses this text info object so that both endpoints are the same. @param end: Whether to collapse to the end; C{True} to collapse to the end, C{False} to collapse to the start. @type end: bool
- abstract copy()
Duplicates this text info object so that changes can be made to either one without affecting the other.
- updateCaret()
Moves the system caret to the position of this text info object
- updateSelection()
Moves the selection (usually the system caret) to the position of this text info object
- _abstract_bookmark = True
- _get_bookmark()
- abstract move(unit, direction, endPoint=None)
Moves one or both of the endpoints of this object by the given unit and direction. @param unit: the unit to move by; one of the UNIT_* constants. @param direction: a positive value moves forward by a number of units, a negative value moves back a number of units. @type direction: int @param endPoint: Either C{None}, “start” or “end”. If “start”, the start of the range is moved; if “end”, the end of the range is moved; if C{None} (not specified), the range is collapsed to its start and both endpoints are moved. @return: The number of units moved; negative indicates backward movement, positive indicates forward movement, 0 means no movement. @rtype: int
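The return-value contract of move can be sketched with a toy range that only understands character units. `ToyOffsetRange` is a hypothetical class written for this example; clamping at the story boundaries and dragging the other endpoint along when the two would cross are simplifying assumptions, not documented behaviour.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToyOffsetRange:
    """Hypothetical character-only range over a story of fixed length."""
    start: int
    end: int
    story_length: int

    def move(self, direction: int, endPoint: Optional[str] = None) -> int:
        """Move by `direction` characters and return the units actually moved."""
        if endPoint not in (None, "start", "end"):
            raise ValueError("endPoint must be None, 'start' or 'end'")
        if endPoint is None:
            # Not specified: collapse to the start, then move both endpoints.
            self.end = self.start
        origin = self.end if endPoint == "end" else self.start
        # Assumption: movement is clamped to the bounds of the story.
        target = max(0, min(self.story_length, origin + direction))
        if endPoint == "start":
            self.start = target
            self.end = max(self.end, self.start)
        elif endPoint == "end":
            self.end = target
            self.start = min(self.start, self.end)
        else:
            self.start = self.end = target
        return target - origin


r = ToyOffsetRange(start=2, end=5, story_length=10)
moved_both = r.move(3)              # collapses to 2, then moves to 5
moved_start = r.move(-100, "start")  # clamped at offset 0
moved_end = r.move(100, "end")       # clamped at the story length
```

A caller can compare the returned count with the requested one to detect that a boundary was reached.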
- find(text, caseSensitive=False, reverse=False)
Locates the given text and positions this TextInfo object at the start. @param text: the text to search for @type text: string @param caseSensitive: C{True} if the search should be case sensitive, C{False} if not @type caseSensitive: bool @param reverse: if C{True}, the search goes from the current position towards the start of the text; if C{False}, towards the end. @type reverse: bool @returns: C{True} if the text is found, C{False} otherwise @rtype: bool
- NVDAObjectAtStart: NVDAObjects.NVDAObject
Typing information for auto-property: _get_NVDAObjectAtStart
- _get_NVDAObjectAtStart() NVDAObjects.NVDAObject
Get the NVDAObject related to the start of the range. Usually it is just the owner NVDAObject, but in the case of virtualBuffers it may be a descendant object. @returns: the NVDAObject at the start
- _get_focusableNVDAObjectAtStart()
Retrieves the deepest focusable NVDAObject related to the start of the range. Usually it is just the owner NVDAObject, but in the case of virtualBuffers it may be a descendant object. @returns: the NVDAObject at the start
- _get_pointAtStart()
Retrieves x and y coordinates corresponding with the textInfo start. It should return Point. The base implementation uses L{boundingRects}. @rtype: L{locationHelper.Point}
- _get_clipboardText()
Text suitably formatted for copying to the clipboard. E.g. crlf characters inserted between lines.
- copyToClipboard(notify=False)
Copy the content of this instance to the clipboard. @return: C{True} if successful, C{False} otherwise. @rtype: bool @param notify: whether to emit a confirmation message @type notify: boolean
- getTextInChunks(unit)
Retrieve the text of this instance in chunks of a given unit. @param unit: The unit at which chunks should be split. @return: Chunks of text. @rtype: generator of str
- getControlFieldSpeech(attrs: ControlField, ancestorAttrs: List[Field], fieldType: str, formatConfig: Dict[str, bool] | None = None, extraDetail: bool = False, reason: OutputReason | None = None) List[Any | str]
- getControlFieldBraille(field, ancestors, reportStart, formatConfig)
- getFormatFieldSpeech(attrs: Field, attrsCache: Field | None = None, formatConfig: Dict[str, bool] | None = None, reason: OutputReason | None = None, unit: str | None = None, extraDetail: bool = False, initialFormat: bool = False) List[Any | str]
Get the spoken representation for given format information. The base implementation just calls L{speech.getFormatFieldSpeech}. This can be extended in order to support implementation specific attributes. If extended, the superclass should be called first.
- activate()
Activate this position. For example, this might activate the object at this position or click the point at this position. @raise NotImplementedError: If not supported.
- getMathMl(field)
Get MathML for a math control field. This will only be called for control fields with a role of L{controlTypes.Role.MATH}. @raise LookupError: If MathML can’t be retrieved for this field.
- _getTextForCodepointMovement() str
Gets the text as used in moveToCodepointOffset.
- moveToCodepointOffset(codepointOffset: int) Self
This function moves textInfos by codepoint characters. A codepoint character represents exactly 1 character in a Pythonic string.
- Illustration:
Suppose we have a TextInfo that represents a paragraph of text:
```
> s = paragraphInfo.text
> s
'Hello, world!\r'
```
Suppose that we would like to put the cursor at the first letter of the word 'world'. That means jumping to index 7:
```
> s[7:]
'world!\r'
```
Here is how this can be done:
```
> info = paragraphInfo.moveToCodepointOffset(7)
> info.setEndPoint(paragraphInfo, "endToEnd")
> info.text
'world!\r'
```
- Background:
In many applications there is no one-to-one mapping between codepoint characters and TextInfo characters, e.g. when calling TextInfo.move(UNIT_CHARACTER, n). There are a couple of reasons for this discrepancy:
1. In wide character encoding, some 4-byte Unicode characters are represented as two surrogate characters, whereas in a Python string they are represented by a single character.
2. In non-offset TextInfos (e.g. UIATextInfo) there is no guarantee that TextInfo.move(UNIT_CHARACTER, 1) actually moves by exactly 1 character. A good illustration of this is Microsoft Word with UIA always enabled: the first character of a bullet list item is represented by three Python codepoint characters:
* Bullet character “•”
* Tab character
* The first character of the list item per se.
In many use cases (e.g. sentence navigation, style navigation), we identify the Python codepoint character that we would like to move our TextInfo to. Using TextInfo.move(UNIT_CHARACTER, n) for this would cause many side effects. This function provides a clean and reliable way to jump to a given codepoint offset.
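The first reason above (surrogate pairs) can be checked directly in Python. This is a small sketch using only the standard library; `utf16_length` and `codepoint_to_utf16_offset` are hypothetical helper names for this example and do not show how the package converts offsets.

```python
def utf16_length(text: str) -> int:
    """Number of UTF-16 code units needed to represent text."""
    return len(text.encode("utf_16_le")) // 2


def codepoint_to_utf16_offset(text: str, codepoint_offset: int) -> int:
    """UTF-16 offset that corresponds to a Python string index."""
    return utf16_length(text[:codepoint_offset])


# U+1F600 lies outside the Basic Multilingual Plane, so UTF-16 needs a surrogate pair.
s = "a\U0001F600b"
python_length = len(s)
wide_length = utf16_length(s)
offset_of_b = codepoint_to_utf16_offset(s, 2)
```

The string has three Python characters but four UTF-16 code units, so the letter at Python index 2 sits at wide-character offset 3.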
- Assumptions:
1. This function operates on a non-collapsed TextInfo only. In a typical scenario, we might want to jump to a certain offset within a paragraph or a line. In this case this function should be called on the TextInfo representing said paragraph or line. The reason is that some implementations might need to access the text of the paragraph or line in order to accurately compute the resulting offset.
2. It assumes that 1 character of the application-specific TextInfo representation maps to 1 or more characters of the codepoint representation.
3. This function is also written with the assumption that a character in the application-specific TextInfo representation might not map to any Python characters, although this scenario has never been observed in any application.
4. This function also assumes that most characters have a 1:1 mapping between the codepoint and application-specific representations. This assumption is not required; however, if it holds, the function will converge faster. If it does not hold, it might take many iterations to find the right TextInfo.
- Algorithm:
This generic implementation is essentially a biased binary search. On every iteration we operate on a Python string and its TextInfo counterpart, stored in the info variable. We would like to reach a certain offset within that string, which is stored in the codepointOffsetLeft variable. In every iteration of the loop:
1. We try to move either from the left end of info by codepointOffsetLeft characters or from the right end by -codepointOffsetRight characters, depending on which move is shorter. We store the destination point as the collapsed TextInfo tmpInfo.
2. We compute the number of Python characters from the beginning of info to tmpInfo and store it in the actualCodepointOffset variable.
3. We compare actualCodepointOffset with codepointOffsetLeft: if they are equal, we have found the desired TextInfo. Otherwise we use tmpInfo as the middle point of the binary search and recurse either to the left or to the right, depending on where the desired offset lies.
One extra part of the algorithm serves to prevent a particular failure condition: if we happen to move in step 1 from the same point twice in two consecutive iterations of the loop, then the second time we will move tmpInfo exactly to the opposite end of info, and the algorithm will fail the sanity check condition in the for loop. To avoid this situation we track the last move and the direction of the last divide in the variables lastMove and lastRecursed. If we detect that we are about to move from the same endpoint for the second time, we reduce the count of characters to make sure the algorithm makes some progress on each iteration.
- bookmark
- boundingRects
- clipboardText
- focusableNVDAObjectAtStart
- isCollapsed
- location
- locationText
- pointAtStart
- unit_mouseChunk
- textInfos.convertToCrlf(text)
Convert a string so that it contains only CRLF line endings. @param text: The text to convert. @type text: str @return: The converted text. @rtype: str
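One way such a conversion could be written is with a single regular expression. This is a sketch only: `to_crlf_sketch` is a hypothetical name, and treating a lone carriage return as a line ending is an assumption made here, not something the description above specifies.

```python
import re

# Match an existing CRLF first so that it is not converted twice.
_LINE_ENDING = re.compile(r"\r\n|\r|\n")


def to_crlf_sketch(text: str) -> str:
    """Return text with every line ending normalised to CRLF."""
    return _LINE_ENDING.sub("\r\n", text)


converted = to_crlf_sketch("a\nb\r\nc\rd")
```

Because the alternation tries `\r\n` before the single characters, text that already uses CRLF passes through unchanged.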
- class textInfos.DocumentWithPageTurns(*args, **kwargs)
Bases:
ScriptableObject
A document which supports multiple pages of text, but only exposes one page at a time.
- turnPage(previous=False)
Switch to the next/previous page of text. @param previous: C{True} to turn to the previous page, C{False} to turn to the next. @type previous: bool @raise RuntimeError: If there are no further pages.
- _propertyCache: Set[Callable[[AutoPropertyObject], Any]]
- class textInfos.TextInfoEndpoint(textInfo: TextInfo, isStart: bool)
Bases:
object
Represents one end of a TextInfo instance. This object can be compared with another end from the same or a different TextInfo instance, using the standard comparison operators: < <= == != >= >
@param textInfo: the TextInfo instance you wish to represent an endpoint of. @param isStart: true to represent the start, false for the end.
- _whichMap: Dict[Tuple[bool, bool], str] = {(False, False): 'endToEnd', (False, True): 'endToStart', (True, False): 'startToEnd', (True, True): 'startToStart'}
- _cmp(other: TextInfoEndpoint) int
A standard cmp function returning: -1 for less than, 0 for equal and 1 for greater than.
- moveTo(other: TextInfoEndpoint) None
Moves the end of the TextInfo this endpoint represents to the position of the given endpoint.
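The relationship between the `_whichMap` lookup, `_cmp` and the comparison operators can be sketched as follows. `ToyRange` and `ToyEndpoint` are hypothetical stand-ins that use plain integer offsets; the use of `functools.total_ordering` is a choice made for this example and is not claimed to match the package's implementation.

```python
from functools import total_ordering

# Same keys and values as the documented _whichMap: (self is start, other is start).
WHICH_MAP = {
    (True, True): "startToStart",
    (True, False): "startToEnd",
    (False, True): "endToStart",
    (False, False): "endToEnd",
}


class ToyRange:
    """Hypothetical range holding two integer offsets."""

    def __init__(self, start: int, end: int):
        self.start_offset = start
        self.end_offset = end

    def compareEndPoints(self, other: "ToyRange", which: str) -> int:
        x = self.start_offset if which.startswith("start") else self.end_offset
        y = other.start_offset if which.endswith("Start") else other.end_offset
        return (x > y) - (x < y)


@total_ordering
class ToyEndpoint:
    """Hypothetical endpoint that delegates comparisons to compareEndPoints."""

    def __init__(self, rng: ToyRange, is_start: bool):
        self.rng = rng
        self.is_start = is_start

    def _cmp(self, other: "ToyEndpoint") -> int:
        which = WHICH_MAP[(self.is_start, other.is_start)]
        return self.rng.compareEndPoints(other.rng, which)

    def __eq__(self, other):
        if not isinstance(other, ToyEndpoint):
            return NotImplemented
        return self._cmp(other) == 0

    def __lt__(self, other):
        if not isinstance(other, ToyEndpoint):
            return NotImplemented
        return self._cmp(other) < 0


a = ToyRange(0, 5)
b = ToyRange(5, 9)
touching = ToyEndpoint(a, False) == ToyEndpoint(b, True)
a_starts_first = ToyEndpoint(a, True) < ToyEndpoint(b, True)
b_ends_last = ToyEndpoint(b, False) >= ToyEndpoint(a, False)
```

The pair of booleans selects the “which” string, so a single three-way comparison is enough to support all six operators.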
- class textInfos.CommentType(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)
Bases:
Enum
A value exposed by the ‘comment’ key of a L{FormatField}.
- GENERAL = 'general'
- DRAFT = 'draft'
- RESOLVED = 'resolved'
Submodules
textInfos.offsets module
- class textInfos.offsets.Offsets(startOffset: int, endOffset: int)
Bases:
object
Represents two offsets.
- startOffset: int
the first offset.
- endOffset: int
the second offset.
- textInfos.offsets.findStartOfLine(text, offset, lineLength=None)
Searches backwards through the given text from the given offset, until it finds the offset that is the start of the line. Without a set line length, it searches for new line / carriage return characters; with a set line length, it simply moves back to sit on a multiple of the line length. @param text: the text to search @type text: str @param offset: the offset of the text to start at @type offset: int @param lineLength: The number of characters that makes up a line, None if new line characters should be looked at instead @type lineLength: int or None @return: the found offset @rtype: int
- textInfos.offsets.findEndOfLine(text, offset, lineLength=None)
Searches forwards through the given text from the given offset, until it finds the offset that is the start of the next line. Without a set line length, it searches for new line / carriage return characters; with a set line length, it simply moves forward to sit on a multiple of the line length. @param text: the text to search @type text: str @param offset: the offset of the text to start at @type offset: int @param lineLength: The number of characters that makes up a line, None if new line characters should be looked at instead @type lineLength: int or None @return: the found offset @rtype: int
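The two strategies described for line searching (newline search versus a fixed line length) can be sketched as below. These are simplified, hypothetical functions: they only recognise "\n" and ignore carriage returns, and clamping the end offset to the text length is an assumption made here.

```python
from typing import Optional


def start_of_line_sketch(text: str, offset: int, lineLength: Optional[int] = None) -> int:
    """Offset of the first character of the line containing offset."""
    if lineLength is not None:
        # Fixed-width lines: step back to the previous multiple of lineLength.
        return offset - (offset % lineLength)
    # rfind returns -1 when no earlier newline exists, which yields 0.
    return text.rfind("\n", 0, offset) + 1


def end_of_line_sketch(text: str, offset: int, lineLength: Optional[int] = None) -> int:
    """Offset of the start of the next line, or the text length if there is none."""
    if lineLength is not None:
        return min(len(text), offset - (offset % lineLength) + lineLength)
    newline = text.find("\n", offset)
    return len(text) if newline == -1 else newline + 1


text = "ab\ncd\nef"
line_start = start_of_line_sketch(text, 4)
line_end = end_of_line_sketch(text, 4)
fixed_start = start_of_line_sketch("abcdefghij", 7, lineLength=4)
fixed_end = end_of_line_sketch("abcdefghij", 7, lineLength=4)
```

In the newline case the line break character is counted as part of the line it terminates, so the end offset is also the start of the next line.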
- textInfos.offsets.findStartOfWord(text, offset, lineLength=None)
Searches backwards through the given text from the given offset, until it finds the offset that is the start of the word. It checks to see if a character is alphanumeric, or is another symbol, or is white space. @param text: the text to search @type text: str @param offset: the offset of the text to start at @type offset: int @param lineLength: The number of characters that makes up a line, None if new line characters should be looked at instead @type lineLength: int or None @return: the found offset @rtype: int
- textInfos.offsets.findEndOfWord(text, offset, lineLength=None)
Searches forwards through the given text from the given offset, until it finds the offset that is the start of the next word. It checks to see if a character is alphanumeric, or is another symbol, or is white space. @param text: the text to search @type text: str @param offset: the offset of the text to start at @type offset: int @param lineLength: The number of characters that makes up a line, None if new line characters should be looked at instead @type lineLength: int or None @return: the found offset @rtype: int
- class textInfos.offsets.OffsetsTextInfo(*args, **kwargs)
Bases:
TextInfo
An abstract TextInfo for text implementations which represent ranges using numeric offsets relative to the start of the text. In such implementations, the start of the text is represented by 0 and the end is the length of the entire text.
All subclasses must implement L{_getStoryLength}. Aside from this, there are two possible implementations:
* If the underlying text implementation does not support retrieval of line offsets, L{_getStoryText} should be implemented. In this case, the base implementation of L{_getLineOffsets} will retrieve the entire text of the object and use text searching algorithms to find line offsets. This is very inefficient and should be avoided if possible.
* Otherwise, subclasses must implement at least L{_getTextRange} and L{_getLineOffsets}. Retrieval of other offsets (e.g. L{_getWordOffsets}) should also be implemented if possible for greatest accuracy and efficiency.
If a caret and/or selection should be supported, L{_getCaretOffset} and/or L{_getSelectionOffsets} should be implemented, respectively. To support conversion from screen points (e.g. for mouse tracking), L{_getOffsetFromPoint} should be implemented. To support conversion to screen rectangles and points (e.g. for magnification or mouse tracking), either L{_getBoundingRectFromOffset} or L{_getPointFromOffset} should be implemented. Note that the base implementation of L{_getPointFromOffset} uses L{_getBoundingRectFromOffset}.
Constructor. Subclasses may extend this to perform implementation specific initialisation, calling their superclass method afterwards.
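The first subclassing option described above (supplying only the story text and letting line offsets be found by searching it) can be sketched with a standalone toy class. `ToyStoryTextInfo` is hypothetical and does not derive from the real base class; it reuses the documented method names purely for illustration, and its line search only recognises "\n".

```python
from typing import Tuple


class ToyStoryTextInfo:
    """Hypothetical stand-in showing the story-text fallback strategy."""

    def __init__(self, story: str):
        self._story = story

    def _getStoryText(self) -> str:
        return self._story

    def _getStoryLength(self) -> int:
        return len(self._getStoryText())

    def _getTextRange(self, start: int, end: int) -> str:
        # The end offset is exclusive, as in the documented _getTextRange.
        return self._getStoryText()[start:end]

    def _getLineOffsets(self, offset: int) -> Tuple[int, int]:
        # Fallback: fetch the whole story and search it, which is why this
        # approach is costly for large documents.
        text = self._getStoryText()
        start = text.rfind("\n", 0, offset) + 1
        newline = text.find("\n", offset)
        end = len(text) if newline == -1 else newline + 1
        return start, end


info = ToyStoryTextInfo("one\ntwo\nthree")
length = info._getStoryLength()
line = info._getLineOffsets(5)
line_text = info._getTextRange(*line)
```

Every call to the line lookup reads the whole story, which makes the cost of this option visible and explains the recommendation to implement native line offsets where the underlying text API allows it.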
- detectFormattingAfterCursorMaybeSlow: bool = True
Honours documentFormatting config option if true - set to false if this is not at all slow.
- useUniscribe: bool = True
Use uniscribe to calculate word offsets etc.
- encoding: str | None = 'utf_16_le'
The encoding internal to the underlying text info implementation.
- _get_locationText()
A message that explains the location of the text position in friendly terms.
- _get_boundingRects() List[RectLTWH]
Per line bounding rectangles for the visible text in this range. Implementations should ensure that the bounding rectangles don’t contain off screen coordinates. @rtype: [L{locationHelper.RectLTWH}] @raise NotImplementedError: If not supported. @raise LookupError: If not available (i.e. off screen, hidden, etc.)
- _getCaretOffset()
- _setCaretOffset(offset)
- _getSelectionOffsets()
- _setSelectionOffsets(start, end)
- abstract _getStoryLength()
- _getStoryText()
Retrieve the entire text of the object. @return: The entire text of the object. @rtype: str
- _getTextRange(start, end)
Retrieve the text in a given offset range. @param start: The start offset. @type start: int @param end: The end offset (exclusive). @type end: int @return: The text contained in the requested range. @rtype: str
- _getFormatFieldAndOffsets(offset, formatConfig, calculateOffsets=True)
Retrieve the formatting information for a given offset and the offsets spanned by that field. Subclasses must override this if support for text formatting is desired. The base implementation associates text with line numbers if possible.
- _calculateUniscribeOffsets(lineText: str, unit: str, relOffset: int) Tuple[int, int] | None
Calculates the bounds of a unit at an offset within a given string of text using the Windows uniscribe library, also used in Notepad, for example. Units supported are character and word. @param lineText: the text string to analyze @param unit: the TextInfo unit (character or word) @param relOffset: the character offset within the text string at which to calculate the bounds.
- _getCharacterOffsets(offset)
- _getWordOffsets(offset)
- _getLineNumFromOffset(offset)
- _getLineOffsets(offset)
- _getParagraphOffsets(offset)
- _getReadingChunkOffsets(offset)
- _getBoundingRectFromOffset(offset)
- _getPointFromOffset(offset)
- _getOffsetFromPoint(x, y)
- _getNVDAObjectFromOffset(offset)
- _getOffsetsFromNVDAObject(obj)
- _get_NVDAObjectAtStart()
Get the NVDAObject related to the start of the range. Usually it is just the owner NVDAObject, but in the case of virtualBuffers it may be a descendant object. @returns: the NVDAObject at the start
- _getUnitOffsets(unit, offset)
- _get_pointAtStart()
Retrieves x and y coordinates corresponding with the textInfo start. It should return Point. The base implementation uses L{boundingRects}. @rtype: L{locationHelper.Point}
- _get_isCollapsed()
@return: C{True} if representing a collapsed range, C{False} if the range is expanded to cover one or more characters. @rtype: bool
- collapse(end=False)
Collapses this text info object so that both endpoints are the same. @param end: Whether to collapse to the end; C{True} to collapse to the end, C{False} to collapse to the start. @type end: bool
- expand(unit)
Expands the start and end of this text info object to a given unit @param unit: a unit constant @type unit: string
- copy()
Duplicates this text info object so that changes can be made to either one without affecting the other.
- compareEndPoints(other, which)
compares one end of this range to one end of another range. Subclasses must implement this. @param other: the text range to compare with. @type other: L{TextInfo} @param which: The ends to compare; one of “startToStart”, “startToEnd”, “endToStart”, “endToEnd”. @return: -1 if this end is before other end, 1 if this end is after other end or 0 if this end and other end are the same. @rtype: int
- setEndPoint(other, which)
Sets one end of this range to one end of another range. Subclasses must implement this. @param other: The range from which an end is being obtained. @type other: L{TextInfo} @param which: The ends to use; one of “startToStart”, “startToEnd”, “endToStart”, “endToEnd”.
- getTextWithFields(formatConfig: Dict | None = None) List[str | FieldCommand]
Retrieves the text in this range, as well as any control/format fields associated therewith. Subclasses may override this. The base implementation just returns the text. @param formatConfig: Document formatting configuration, useful if you wish to force a particular configuration for a particular task. @return: A sequence of text strings interspersed with associated field commands.
- _get_text()
The text within this range. Subclasses must implement this. @return: The text. @note: The text is not guaranteed to be the exact length of the range in offsets.
- unitIndex(unit)
@param unit: a unit constant for which you want to retrieve an index @returns: The 1-based index of this unit, out of all the units of this type in the object
- unitCount(unit)
@param unit: a unit constant @type unit: string @returns: the number of units of this type in the object @rtype: int
- NVDAObjectAtStart: NVDAObjects.NVDAObject
Typing information for auto-property: _get_NVDAObjectAtStart
- allowMoveToOffsetPastEnd = True
Whether a move by UNIT_CHARACTER can move 1 past the story length, to allow braille routing to the end insertion point. (#2096)
- bookmark
- boundingRects
- isCollapsed
- locationText
- pointAtStart
- text: str
Typing information for auto-property: _get_text
- move(unit, direction, endPoint=None)
Moves one or both of the endpoints of this object by the given unit and direction. @param unit: the unit to move by; one of the UNIT_* constants. @param direction: a positive value moves forward by a number of units, a negative value moves back a number of units. @type direction: int @param endPoint: Either C{None}, “start” or “end”. If “start”, the start of the range is moved; if “end”, the end of the range is moved; if C{None} (not specified), the range is collapsed to its start and both endpoints are moved. @return: The number of units moved; negative indicates backward movement, positive indicates forward movement, 0 means no movement. @rtype: int
- find(text, caseSensitive=False, reverse=False)
Locates the given text and positions this TextInfo object at the start. @param text: the text to search for @type text: string @param caseSensitive: C{True} if the search should be case sensitive, C{False} if not @type caseSensitive: bool @param reverse: if C{True}, the search goes from the current position towards the start of the text; if C{False}, towards the end. @type reverse: bool @returns: C{True} if the text is found, C{False} otherwise @rtype: bool
- updateCaret()
Moves the system caret to the position of this text info object
- updateSelection()
Moves the selection (usually the system caret) to the position of this text info object
- _get_bookmark()
- _getFirstVisibleOffset()
- _getLastVisibleOffset()
- _getOffsetEncoder()
- moveToCodepointOffset(codepointOffset: int) Self
This function moves textInfos by codepoint characters. A codepoint character represents exactly 1 character in a Pythonic string.
- Illustration:
Suppose we have a TextInfo that represents a paragraph of text:
```
> s = paragraphInfo.text
> s
'Hello, world!\r'
```
Suppose that we would like to put the cursor at the first letter of the word 'world'. That means jumping to index 7:
```
> s[7:]
'world!\r'
```
Here is how this can be done:
```
> info = paragraphInfo.moveToCodepointOffset(7)
> info.setEndPoint(paragraphInfo, "endToEnd")
> info.text
'world!\r'
```
- Background:
In many applications there is no one-to-one mapping between codepoint characters and TextInfo characters, e.g. when calling TextInfo.move(UNIT_CHARACTER, n). There are a couple of reasons for this discrepancy:
1. In wide character encoding, some 4-byte Unicode characters are represented as two surrogate characters, whereas in a Python string they are represented by a single character.
2. In non-offset TextInfos (e.g. UIATextInfo) there is no guarantee that TextInfo.move(UNIT_CHARACTER, 1) actually moves by exactly 1 character. A good illustration of this is Microsoft Word with UIA always enabled: the first character of a bullet list item is represented by three Python codepoint characters:
* Bullet character “•”
* Tab character
* The first character of the list item per se.
In many use cases (e.g. sentence navigation, style navigation), we identify the Python codepoint character that we would like to move our TextInfo to. Using TextInfo.move(UNIT_CHARACTER, n) for this would cause many side effects. This function provides a clean and reliable way to jump to a given codepoint offset.
- Assumptions:
1. This function operates on a non-collapsed TextInfo only. In a typical scenario, we might want to jump to a certain offset within a paragraph or a line. In this case this function should be called on the TextInfo representing said paragraph or line. The reason is that some implementations might need to access the text of the paragraph or line in order to accurately compute the resulting offset.
2. It assumes that 1 character of the application-specific TextInfo representation maps to 1 or more characters of the codepoint representation.
3. This function is also written with the assumption that a character in the application-specific TextInfo representation might not map to any Python characters, although this scenario has never been observed in any application.
4. This function also assumes that most characters have a 1:1 mapping between the codepoint and application-specific representations. This assumption is not required; however, if it holds, the function will converge faster. If it does not hold, it might take many iterations to find the right TextInfo.
- Algorithm:
This generic implementation is essentially a biased binary search. On every iteration we operate on a Python string and its TextInfo counterpart, stored in the info variable. We would like to reach a certain offset within that string, which is stored in the codepointOffsetLeft variable. In every iteration of the loop:
1. We try to move either from the left end of info by codepointOffsetLeft characters or from the right end by -codepointOffsetRight characters, depending on which move is shorter. We store the destination point as the collapsed TextInfo tmpInfo.
2. We compute the number of Python characters from the beginning of info to tmpInfo and store it in the actualCodepointOffset variable.
3. We compare actualCodepointOffset with codepointOffsetLeft: if they are equal, we have found the desired TextInfo. Otherwise we use tmpInfo as the middle point of the binary search and recurse either to the left or to the right, depending on where the desired offset lies.
One extra part of the algorithm serves to prevent a particular failure condition: if we happen to move in step 1 from the same point twice in two consecutive iterations of the loop, then the second time we will move tmpInfo exactly to the opposite end of info, and the algorithm will fail the sanity check condition in the for loop. To avoid this situation we track the last move and the direction of the last divide in the variables lastMove and lastRecursed. If we detect that we are about to move from the same endpoint for the second time, we reduce the count of characters to make sure the algorithm makes some progress on each iteration.