Struct GraphemeCursor
pub struct GraphemeCursor { /* private fields */ }
Available on crate feature dep_unicode_segmentation only.
Cursor-based segmenter for grapheme clusters.
This allows working with ropes and other data structures where the string is not contiguous or not fully known at initialization time.
Implementations
impl GraphemeCursor
pub fn new(offset: usize, len: usize, is_extended: bool) -> GraphemeCursor
Create a new cursor. The string length and initial offset are given at creation time, but the contents of the string are not. The is_extended parameter controls whether extended grapheme clusters are selected.
The offset parameter must be on a codepoint boundary.
let s = "हिन्दी";
let mut legacy = GraphemeCursor::new(0, s.len(), false);
assert_eq!(legacy.next_boundary(s, 0), Ok(Some("ह".len())));
let mut extended = GraphemeCursor::new(0, s.len(), true);
assert_eq!(extended.next_boundary(s, 0), Ok(Some("हि".len())));
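Because the offset passed to new() must lie on a codepoint boundary, an arbitrary byte index may need adjusting before it is used. The following is a minimal sketch using only the standard library; the helper name snap_to_char_boundary is hypothetical and is not part of this crate.

```rust
/// Hypothetical helper (not part of this crate): moves `offset` back to the
/// nearest UTF-8 codepoint boundary at or before it, clamping to `s.len()`.
fn snap_to_char_boundary(s: &str, offset: usize) -> usize {
    let mut i = offset.min(s.len());
    // `is_char_boundary` is true for 0 and for `s.len()`, so the loop
    // always terminates without underflowing.
    while !s.is_char_boundary(i) {
        i -= 1;
    }
    i
}
```

The returned value could then serve as the offset argument of new() or set_cursor().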
pub fn set_cursor(&mut self, offset: usize)
Set the cursor to a new location in the same string.
let s = "abcd";
let mut cursor = GraphemeCursor::new(0, s.len(), false);
assert_eq!(cursor.cur_cursor(), 0);
cursor.set_cursor(2);
assert_eq!(cursor.cur_cursor(), 2);
pub fn cur_cursor(&self) -> usize
The current offset of the cursor. Equal to the last value provided to new() or set_cursor(), or returned from next_boundary() or prev_boundary().
// Two flags (🇷🇸🇮🇴), each flag is two RIS codepoints, each RIS is 4 bytes.
let flags = "\u{1F1F7}\u{1F1F8}\u{1F1EE}\u{1F1F4}";
let mut cursor = GraphemeCursor::new(4, flags.len(), false);
assert_eq!(cursor.cur_cursor(), 4);
assert_eq!(cursor.next_boundary(flags, 0), Ok(Some(8)));
assert_eq!(cursor.cur_cursor(), 8);
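The byte offsets in the example above follow from UTF-8 encoding, where each Regional Indicator Symbol occupies 4 bytes. As a small standard-library-only sketch, the start offset of every codepoint can be listed like this; the function name codepoint_offsets is an illustrative assumption, not an API of this crate.

```rust
/// Hypothetical helper: the byte offset at which each codepoint of `s` starts.
fn codepoint_offsets(s: &str) -> Vec<usize> {
    s.char_indices().map(|(i, _)| i).collect()
}
```

For the two-flag string this gives the offsets 0, 4, 8 and 12, so offset 8 is where the second flag starts.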
pub fn provide_context(&mut self, chunk: &str, chunk_start: usize)
Provide additional pre-context when it is needed to decide a boundary. The end of the chunk must coincide with the value given in the GraphemeIncomplete::PreContext request.
let flags = "\u{1F1F7}\u{1F1F8}\u{1F1EE}\u{1F1F4}";
let mut cursor = GraphemeCursor::new(8, flags.len(), false);
// Not enough pre-context to decide if there's a boundary between the two flags.
assert_eq!(cursor.is_boundary(&flags[8..], 8), Err(GraphemeIncomplete::PreContext(8)));
// Provide one more Regional Indicator Symbol of pre-context
cursor.provide_context(&flags[4..8], 4);
// Still not enough context to decide.
assert_eq!(cursor.is_boundary(&flags[8..], 8), Err(GraphemeIncomplete::PreContext(4)));
// Provide additional requested context.
cursor.provide_context(&flags[0..4], 0);
// That's enough to decide (it always is when context goes to the start of the string)
assert_eq!(cursor.is_boundary(&flags[8..], 8), Ok(true));
pub fn is_boundary(
    &mut self,
    chunk: &str,
    chunk_start: usize,
) -> Result<bool, GraphemeIncomplete>
Determine whether the current cursor location is a grapheme cluster boundary. Only a part of the string need be supplied. If chunk_start is nonzero or the length of chunk is not equal to len on creation, then this method may return GraphemeIncomplete::PreContext. The caller should then call provide_context with the requested chunk, then retry calling this method.
For partial chunks, if the cursor is not at the beginning or end of the string, the chunk should contain at least the codepoint following the cursor. If the string is nonempty, the chunk must be nonempty.
All calls should have consistent chunk contents (i.e., if a chunk provides content for a given slice, all further chunks covering that slice must have the same content for it).
let flags = "\u{1F1F7}\u{1F1F8}\u{1F1EE}\u{1F1F4}";
let mut cursor = GraphemeCursor::new(8, flags.len(), false);
assert_eq!(cursor.is_boundary(flags, 0), Ok(true));
cursor.set_cursor(12);
assert_eq!(cursor.is_boundary(flags, 0), Ok(false));
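When the text is stored in several pieces, each call needs a chunk together with the byte offset at which that chunk starts in the whole string. The following is a minimal sketch of that lookup, assuming the pieces are held in a slice of &str and are split on codepoint boundaries; the chunk_at helper is hypothetical and is not provided by this crate.

```rust
/// Hypothetical helper: finds the chunk containing byte `offset` and returns
/// it together with its start offset in the whole string (the value expected
/// as the `chunk_start` argument of the cursor methods).
/// An offset equal to the total length maps to the last non-empty chunk.
fn chunk_at<'a>(chunks: &[&'a str], offset: usize) -> Option<(&'a str, usize)> {
    let mut start = 0;
    let mut last = None;
    for &chunk in chunks {
        if chunk.is_empty() {
            continue; // empty pieces can never contain the cursor
        }
        if offset < start + chunk.len() {
            return Some((chunk, start));
        }
        last = Some((chunk, start));
        start += chunk.len();
    }
    // `offset` is at or past the end: only the exact end is valid.
    if offset == start { last } else { None }
}
```

The returned pair could be passed as chunk and chunk_start. Because the selected chunk contains the cursor offset, it also contains the codepoint following the cursor whenever the cursor is not at the end of the string.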
pub fn next_boundary(
    &mut self,
    chunk: &str,
    chunk_start: usize,
) -> Result<Option<usize>, GraphemeIncomplete>
Find the next boundary after the current cursor position. Only a part of the string need be supplied. If the chunk is incomplete, then this method might return GraphemeIncomplete::PreContext or GraphemeIncomplete::NextChunk. In the former case, the caller should call provide_context with the requested chunk, then retry. In the latter case, the caller should provide the chunk following the one given, then retry.
See is_boundary for expectations on the provided chunk.
let flags = "\u{1F1F7}\u{1F1F8}\u{1F1EE}\u{1F1F4}";
let mut cursor = GraphemeCursor::new(4, flags.len(), false);
assert_eq!(cursor.next_boundary(flags, 0), Ok(Some(8)));
assert_eq!(cursor.next_boundary(flags, 0), Ok(Some(16)));
assert_eq!(cursor.next_boundary(flags, 0), Ok(None));
And an example that uses partial strings:
let s = "abcd";
let mut cursor = GraphemeCursor::new(0, s.len(), false);
assert_eq!(cursor.next_boundary(&s[..2], 0), Ok(Some(1)));
assert_eq!(cursor.next_boundary(&s[..2], 0), Err(GraphemeIncomplete::NextChunk));
assert_eq!(cursor.next_boundary(&s[2..4], 2), Ok(Some(2)));
assert_eq!(cursor.next_boundary(&s[2..4], 2), Ok(Some(3)));
assert_eq!(cursor.next_boundary(&s[2..4], 2), Ok(Some(4)));
assert_eq!(cursor.next_boundary(&s[2..4], 2), Ok(None));
pub fn prev_boundary(
    &mut self,
    chunk: &str,
    chunk_start: usize,
) -> Result<Option<usize>, GraphemeIncomplete>
Find the previous boundary before the current cursor position. Only a part of the string need be supplied. If the chunk is incomplete, then this method might return GraphemeIncomplete::PreContext or GraphemeIncomplete::PrevChunk. In the former case, the caller should call provide_context with the requested chunk, then retry. In the latter case, the caller should provide the chunk preceding the one given, then retry.
See is_boundary for expectations on the provided chunk.
let flags = "\u{1F1F7}\u{1F1F8}\u{1F1EE}\u{1F1F4}";
let mut cursor = GraphemeCursor::new(12, flags.len(), false);
assert_eq!(cursor.prev_boundary(flags, 0), Ok(Some(8)));
assert_eq!(cursor.prev_boundary(flags, 0), Ok(Some(0)));
assert_eq!(cursor.prev_boundary(flags, 0), Ok(None));
And an example that uses partial strings (note that the exact return value is not guaranteed, and may be PrevChunk or PreContext arbitrarily):
let s = "abcd";
let mut cursor = GraphemeCursor::new(4, s.len(), false);
assert_eq!(cursor.prev_boundary(&s[2..4], 2), Ok(Some(3)));
assert_eq!(cursor.prev_boundary(&s[2..4], 2), Err(GraphemeIncomplete::PrevChunk));
assert_eq!(cursor.prev_boundary(&s[0..2], 0), Ok(Some(2)));
assert_eq!(cursor.prev_boundary(&s[0..2], 0), Ok(Some(1)));
assert_eq!(cursor.prev_boundary(&s[0..2], 0), Ok(Some(0)));
assert_eq!(cursor.prev_boundary(&s[0..2], 0), Ok(None));
Trait Implementations
impl Clone for GraphemeCursor
fn clone(&self) -> GraphemeCursor
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.
Auto Trait Implementations
impl Freeze for GraphemeCursor
impl RefUnwindSafe for GraphemeCursor
impl Send for GraphemeCursor
impl Sync for GraphemeCursor
impl Unpin for GraphemeCursor
impl UnwindSafe for GraphemeCursor
Blanket Implementations
impl<T> ArchivePointee for T
type ArchivedMetadata = ()
fn pointer_metadata(
    _: &<T as ArchivePointee>::ArchivedMetadata,
) -> <T as Pointee>::Metadata
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T> ByteSized for T
const BYTE_ALIGN: usize = _
fn byte_align(&self) -> usize
fn ptr_size_ratio(&self) -> [usize; 2]
impl<T, R> Chain<R> for T
where
    T: ?Sized,
impl<T> CloneToUninit for T
where
    T: Clone,
impl<T> ExtAny for T
fn as_any_mut(&mut self) -> &mut dyn Any
where
    Self: Sized,
impl<T> ExtMem for T
where
    T: ?Sized,
const NEEDS_DROP: bool = _
fn mem_align_of_val(&self) -> usize
fn mem_size_of_val(&self) -> usize
fn mem_needs_drop(&self) -> bool
Returns true if dropping values of this type matters.
fn mem_forget(self)
where
    Self: Sized,
Forgets about self without running its destructor.
fn mem_replace(&mut self, other: Self) -> Self
where
    Self: Sized,
unsafe fn mem_zeroed<T>() -> T
Available on crate feature unsafe_layout only.
Returns the value of type T represented by the all-zero byte-pattern.
unsafe fn mem_transmute_copy<Src, Dst>(src: &Src) -> Dst
Available on crate feature unsafe_layout only.
Returns the value of type T represented by the all-zero byte-pattern.
fn mem_as_bytes(&self) -> &[u8]
Available on crate feature unsafe_slice only.
impl<S> FromSample<S> for S
fn from_sample_(s: S) -> S
impl<T> Hook for T
impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.
impl<F, T> IntoSample<T> for F
where
    T: FromSample<F>,
fn into_sample(self) -> T
impl<T> LayoutRaw for T
fn layout_raw(_: <T as Pointee>::Metadata) -> Result<Layout, LayoutError>
impl<T, N1, N2> Niching<NichedOption<T, N1>> for N2
unsafe fn is_niched(niched: *const NichedOption<T, N1>) -> bool
fn resolve_niched(out: Place<NichedOption<T, N1>>)
Writes data to out indicating that a T is niched.