| <!DOCTYPE html> |
| <html lang="en"> |
| <head> |
| <meta charset="utf-8"> |
| <meta name="viewport" content="width=device-width, initial-scale=1.0"> |
| <meta name="generator" content="rustdoc"> |
| <meta name="description" content="API documentation for the Rust `RegexSet` struct in crate `regex`."> |
| <meta name="keywords" content="rust, rustlang, rust-lang, RegexSet"> |
| |
| <title>regex::RegexSet - Rust</title> |
| |
| <link rel="stylesheet" type="text/css" href="../normalize.css"> |
| <link rel="stylesheet" type="text/css" href="../rustdoc.css"> |
| <link rel="stylesheet" type="text/css" href="../main.css"> |
| |
| |
| <link rel="shortcut icon" href="https://www.rust-lang.org/favicon.ico"> |
| |
| </head> |
| <body class="rustdoc struct"> |
| <!--[if lte IE 8]> |
| <div class="warning"> |
| This old browser is unsupported and will most likely display funky |
| things. |
| </div> |
| <![endif]--> |
| |
| |
| |
| <nav class="sidebar"> |
| <a href='../regex/index.html'><img src='https://www.rust-lang.org/logos/rust-logo-128x128-blk-v2.png' alt='logo' width='100'></a> |
| <p class='location'>Struct RegexSet</p><div class="block items"><ul><li><a href="#methods">Methods</a></li><li><a href="#implementations">Trait Implementations</a></li></ul></div><p class='location'><a href='index.html'>regex</a></p><script>window.sidebarCurrent = {name: 'RegexSet', ty: 'struct', relpath: ''};</script><script defer src="sidebar-items.js"></script> |
| </nav> |
| |
| <nav class="sub"> |
| <form class="search-form js-only"> |
| <div class="search-container"> |
| <input class="search-input" name="search" |
| autocomplete="off" |
| placeholder="Click or press ‘S’ to search, ‘?’ for more options…" |
| type="search"> |
| </div> |
| </form> |
| </nav> |
| |
| <section id='main' class="content"> |
| <h1 class='fqn'><span class='in-band'>Struct <a href='index.html'>regex</a>::<wbr><a class="struct" href=''>RegexSet</a></span><span class='out-of-band'><span id='render-detail'> |
| <a id="toggle-all-docs" href="javascript:void(0)" title="collapse all docs"> |
| [<span class='inner'>−</span>] |
| </a> |
| </span><a class='srclink' href='../src/regex/re_set.rs.html#105' title='goto source code'>[src]</a></span></h1> |
| <pre class='rust struct'>pub struct RegexSet(_);</pre><div class='docblock'><p>Match multiple (possibly overlapping) regular expressions in a single scan.</p> |
| |
| <p>A regex set corresponds to the union of two or more regular expressions. |
| That is, a regex set will match text where at least one of its |
| constituent regular expressions matches. A regex set as its formulated here |
| provides a touch more power: it will also report <em>which</em> regular |
| expressions in the set match. Indeed, this is the key difference between |
| regex sets and a single <code>Regex</code> with many alternates, since only one |
| alternate can match at a time.</p> |
| |
| <p>For example, consider regular expressions to match email addresses and |
| domains: <code>[a-z]+@[a-z]+\.(com|org|net)</code> and <code>[a-z]+\.(com|org|net)</code>. If a |
| regex set is constructed from those regexes, then searching the text |
| <code>foo@example.com</code> will report both regexes as matching. Of course, one |
| could accomplish this by compiling each regex on its own and doing two |
| searches over the text. The key advantage of using a regex set is that it |
| will report the matching regexes using a <em>single pass through the text</em>. |
| If one has hundreds or thousands of regexes to match repeatedly (like a URL |
| router for a complex web application or a user agent matcher), then a regex |
| set can realize huge performance gains.</p> |
| |
| <h1 id='example' class='section-header'><a href='#example'>Example</a></h1> |
| <p>This shows how the above two regexes (for matching email addresses and |
| domains) might work:</p> |
| |
| <pre class="rust rust-example-rendered"> |
| <span class="kw">let</span> <span class="ident">set</span> <span class="op">=</span> <span class="ident">RegexSet</span>::<span class="ident">new</span>(<span class="kw-2">&</span>[ |
| <span class="string">r"[a-z]+@[a-z]+\.(com|org|net)"</span>, |
| <span class="string">r"[a-z]+\.(com|org|net)"</span>, |
| ]).<span class="ident">unwrap</span>(); |
| |
| <span class="comment">// Ask whether any regexes in the set match.</span> |
| <span class="macro">assert</span><span class="macro">!</span>(<span class="ident">set</span>.<span class="ident">is_match</span>(<span class="string">"foo@example.com"</span>)); |
| |
| <span class="comment">// Identify which regexes in the set match.</span> |
| <span class="kw">let</span> <span class="ident">matches</span>: <span class="ident">Vec</span><span class="op"><</span>_<span class="op">></span> <span class="op">=</span> <span class="ident">set</span>.<span class="ident">matches</span>(<span class="string">"foo@example.com"</span>).<span class="ident">into_iter</span>().<span class="ident">collect</span>(); |
| <span class="macro">assert_eq</span><span class="macro">!</span>(<span class="macro">vec</span><span class="macro">!</span>[<span class="number">0</span>, <span class="number">1</span>], <span class="ident">matches</span>); |
| |
| <span class="comment">// Try again, but with text that only matches one of the regexes.</span> |
| <span class="kw">let</span> <span class="ident">matches</span>: <span class="ident">Vec</span><span class="op"><</span>_<span class="op">></span> <span class="op">=</span> <span class="ident">set</span>.<span class="ident">matches</span>(<span class="string">"example.com"</span>).<span class="ident">into_iter</span>().<span class="ident">collect</span>(); |
| <span class="macro">assert_eq</span><span class="macro">!</span>(<span class="macro">vec</span><span class="macro">!</span>[<span class="number">1</span>], <span class="ident">matches</span>); |
| |
| <span class="comment">// Try again, but with text that doesn't match any regex in the set.</span> |
| <span class="kw">let</span> <span class="ident">matches</span>: <span class="ident">Vec</span><span class="op"><</span>_<span class="op">></span> <span class="op">=</span> <span class="ident">set</span>.<span class="ident">matches</span>(<span class="string">"example"</span>).<span class="ident">into_iter</span>().<span class="ident">collect</span>(); |
| <span class="macro">assert</span><span class="macro">!</span>(<span class="ident">matches</span>.<span class="ident">is_empty</span>());</pre> |
| |
| <p>Note that it would be possible to adapt the above example to using <code>Regex</code> |
| with an expression like:</p> |
| |
| <pre class="rust rust-example-rendered"> |
| (<span class="question-mark">?</span><span class="ident">P</span><span class="op"><</span><span class="ident">email</span><span class="op">></span>[<span class="ident">a</span><span class="op">-</span><span class="ident">z</span>]<span class="op">+</span>@(<span class="question-mark">?</span><span class="ident">P</span><span class="op"><</span><span class="ident">email_domain</span><span class="op">></span>[<span class="ident">a</span><span class="op">-</span><span class="ident">z</span>]<span class="op">+</span>[.](<span class="ident">com</span><span class="op">|</span><span class="ident">org</span><span class="op">|</span><span class="ident">net</span>)))<span class="op">|</span>(<span class="question-mark">?</span><span class="ident">P</span><span class="op"><</span><span class="ident">domain</span><span class="op">></span>[<span class="ident">a</span><span class="op">-</span><span class="ident">z</span>]<span class="op">+</span>[.](<span class="ident">com</span><span class="op">|</span><span class="ident">org</span><span class="op">|</span><span class="ident">net</span>))</pre> |
| |
| <p>After a match, one could then inspect the capture groups to figure out |
| which alternates matched. The problem is that it is hard to make this |
| approach scale when there are many regexes since the overlap between each |
| alternate isn't always obvious to reason about.</p> |
| |
| <h1 id='limitations' class='section-header'><a href='#limitations'>Limitations</a></h1> |
| <p>Regex sets are limited to answering the following two questions:</p> |
| |
| <ol> |
| <li>Does any regex in the set match?</li> |
| <li>If so, which regexes in the set match?</li> |
| </ol> |
| |
| <p>As with the main <code>Regex</code> type, it is cheaper to ask (1) instead of (2) |
| since the matching engines can stop after the first match is found.</p> |
| |
| <p>Other features like finding the location of successive matches or their |
| sub-captures aren't supported. If you need this functionality, the |
| recommended approach is to compile each regex in the set independently and |
| selectively match them based on which regexes in the set matched.</p> |
| |
| <h1 id='performance' class='section-header'><a href='#performance'>Performance</a></h1> |
| <p>A <code>RegexSet</code> has the same performance characteristics as <code>Regex</code>. Namely, |
| search takes <code>O(mn)</code> time, where <code>m</code> is proportional to the size of the |
| regex set and <code>n</code> is proportional to the length of the search text.</p> |
| </div><h2 id='methods'>Methods</h2><h3 class='impl'><span class='in-band'><code>impl <a class="struct" href="../regex/struct.RegexSet.html" title="struct regex::RegexSet">RegexSet</a></code></span><span class='out-of-band'><div class='ghost'></div><a class='srclink' href='../src/regex/re_set.rs.html#107-207' title='goto source code'>[src]</a></span></h3> |
| <div class='impl-items'><h4 id='method.new' class="method"><span id='new.v' class='invisible'><code>fn <a href='#method.new' class='fnname'>new</a><I, S>(exprs: I) -> <a class="enum" href="https://doc.rust-lang.org/nightly/core/result/enum.Result.html" title="enum core::result::Result">Result</a><<a class="struct" href="../regex/struct.RegexSet.html" title="struct regex::RegexSet">RegexSet</a>, <a class="enum" href="../regex/enum.Error.html" title="enum regex::Error">Error</a>> <span class="where fmt-newline">where<br> S: <a class="trait" href="https://doc.rust-lang.org/nightly/core/convert/trait.AsRef.html" title="trait core::convert::AsRef">AsRef</a><<a class="primitive" href="https://doc.rust-lang.org/nightly/std/primitive.str.html">str</a>>,<br> I: <a class="trait" href="https://doc.rust-lang.org/nightly/core/iter/traits/trait.IntoIterator.html" title="trait core::iter::traits::IntoIterator">IntoIterator</a><Item = S>, </span></code></span></h4> |
| <div class='docblock'><p>Create a new regex set with the given regular expressions.</p> |
| |
| <p>This takes an iterator of <code>S</code>, where <code>S</code> is something that can produce |
| a <code>&str</code>. If any of the strings in the iterator are not valid regular |
| expressions, then an error is returned.</p> |
| |
| <h1 id='example-1' class='section-header'><a href='#example-1'>Example</a></h1> |
| <p>Create a new regex set from an iterator of strings:</p> |
| |
| <pre class="rust rust-example-rendered"> |
| <span class="kw">let</span> <span class="ident">set</span> <span class="op">=</span> <span class="ident">RegexSet</span>::<span class="ident">new</span>(<span class="kw-2">&</span>[<span class="string">r"\w+"</span>, <span class="string">r"\d+"</span>]).<span class="ident">unwrap</span>(); |
| <span class="macro">assert</span><span class="macro">!</span>(<span class="ident">set</span>.<span class="ident">is_match</span>(<span class="string">"foo"</span>));</pre> |
| </div><h4 id='method.is_match' class="method"><span id='is_match.v' class='invisible'><code>fn <a href='#method.is_match' class='fnname'>is_match</a>(&self, text: &<a class="primitive" href="https://doc.rust-lang.org/nightly/std/primitive.str.html">str</a>) -> <a class="primitive" href="https://doc.rust-lang.org/nightly/std/primitive.bool.html">bool</a></code></span></h4> |
| <div class='docblock'><p>Returns true if and only if one of the regexes in this set matches |
| the text given.</p> |
| |
| <p>This method should be preferred if you only need to test whether any |
| of the regexes in the set should match, but don't care about <em>which</em> |
| regexes matched. This is because the underlying matching engine will |
| quit immediately after seeing the first match instead of continuing to |
| find all matches.</p> |
| |
| <p>Note that as with searches using <code>Regex</code>, the expression is unanchored |
| by default. That is, if the regex does not start with <code>^</code> or <code>\A</code>, or |
| end with <code>$</code> or <code>\z</code>, then it is permitted to match anywhere in the |
| text.</p> |
| |
| <h1 id='example-2' class='section-header'><a href='#example-2'>Example</a></h1> |
| <p>Tests whether a set matches some text:</p> |
| |
| <pre class="rust rust-example-rendered"> |
| <span class="kw">let</span> <span class="ident">set</span> <span class="op">=</span> <span class="ident">RegexSet</span>::<span class="ident">new</span>(<span class="kw-2">&</span>[<span class="string">r"\w+"</span>, <span class="string">r"\d+"</span>]).<span class="ident">unwrap</span>(); |
| <span class="macro">assert</span><span class="macro">!</span>(<span class="ident">set</span>.<span class="ident">is_match</span>(<span class="string">"foo"</span>)); |
| <span class="macro">assert</span><span class="macro">!</span>(<span class="op">!</span><span class="ident">set</span>.<span class="ident">is_match</span>(<span class="string">"☃"</span>));</pre> |
| </div><h4 id='method.matches' class="method"><span id='matches.v' class='invisible'><code>fn <a href='#method.matches' class='fnname'>matches</a>(&self, text: &<a class="primitive" href="https://doc.rust-lang.org/nightly/std/primitive.str.html">str</a>) -> <a class="struct" href="../regex/struct.SetMatches.html" title="struct regex::SetMatches">SetMatches</a></code></span></h4> |
| <div class='docblock'><p>Returns the set of regular expressions that match in the given text.</p> |
| |
| <p>The set returned contains the index of each regular expression that |
| matches in the given text. The index is in correspondence with the |
| order of regular expressions given to <code>RegexSet</code>'s constructor.</p> |
| |
| <p>The set can also be used to iterate over the matched indices.</p> |
| |
| <p>Note that as with searches using <code>Regex</code>, the expression is unanchored |
| by default. That is, if the regex does not start with <code>^</code> or <code>\A</code>, or |
| end with <code>$</code> or <code>\z</code>, then it is permitted to match anywhere in the |
| text.</p> |
| |
| <h1 id='example-3' class='section-header'><a href='#example-3'>Example</a></h1> |
| <p>Tests which regular expressions match the given text:</p> |
| |
| <pre class="rust rust-example-rendered"> |
| <span class="kw">let</span> <span class="ident">set</span> <span class="op">=</span> <span class="ident">RegexSet</span>::<span class="ident">new</span>(<span class="kw-2">&</span>[ |
| <span class="string">r"\w+"</span>, |
| <span class="string">r"\d+"</span>, |
| <span class="string">r"\pL+"</span>, |
| <span class="string">r"foo"</span>, |
| <span class="string">r"bar"</span>, |
| <span class="string">r"barfoo"</span>, |
| <span class="string">r"foobar"</span>, |
| ]).<span class="ident">unwrap</span>(); |
| <span class="kw">let</span> <span class="ident">matches</span>: <span class="ident">Vec</span><span class="op"><</span>_<span class="op">></span> <span class="op">=</span> <span class="ident">set</span>.<span class="ident">matches</span>(<span class="string">"foobar"</span>).<span class="ident">into_iter</span>().<span class="ident">collect</span>(); |
| <span class="macro">assert_eq</span><span class="macro">!</span>(<span class="ident">matches</span>, <span class="macro">vec</span><span class="macro">!</span>[<span class="number">0</span>, <span class="number">2</span>, <span class="number">3</span>, <span class="number">4</span>, <span class="number">6</span>]); |
| |
| <span class="comment">// You can also test whether a particular regex matched:</span> |
| <span class="kw">let</span> <span class="ident">matches</span> <span class="op">=</span> <span class="ident">set</span>.<span class="ident">matches</span>(<span class="string">"foobar"</span>); |
| <span class="macro">assert</span><span class="macro">!</span>(<span class="op">!</span><span class="ident">matches</span>.<span class="ident">matched</span>(<span class="number">5</span>)); |
| <span class="macro">assert</span><span class="macro">!</span>(<span class="ident">matches</span>.<span class="ident">matched</span>(<span class="number">6</span>));</pre> |
| </div><h4 id='method.len' class="method"><span id='len.v' class='invisible'><code>fn <a href='#method.len' class='fnname'>len</a>(&self) -> <a class="primitive" href="https://doc.rust-lang.org/nightly/std/primitive.usize.html">usize</a></code></span></h4> |
| <div class='docblock'><p>Returns the total number of regular expressions in this set.</p> |
| </div></div><h2 id='implementations'>Trait Implementations</h2><h3 class='impl'><span class='in-band'><code>impl <a class="trait" href="https://doc.rust-lang.org/nightly/core/clone/trait.Clone.html" title="trait core::clone::Clone">Clone</a> for <a class="struct" href="../regex/struct.RegexSet.html" title="struct regex::RegexSet">RegexSet</a></code></span><span class='out-of-band'><div class='ghost'></div><a class='srclink' href='../src/regex/re_set.rs.html#104' title='goto source code'>[src]</a></span></h3> |
| <div class='impl-items'><h4 id='method.clone' class="method"><span id='clone.v' class='invisible'><code>fn <a href='https://doc.rust-lang.org/nightly/core/clone/trait.Clone.html#tymethod.clone' class='fnname'>clone</a>(&self) -> <a class="struct" href="../regex/struct.RegexSet.html" title="struct regex::RegexSet">RegexSet</a></code></span></h4> |
| <div class='docblock'><p>Returns a copy of the value. <a href="https://doc.rust-lang.org/nightly/core/clone/trait.Clone.html#tymethod.clone">Read more</a></p> |
| </div><h4 id='method.clone_from' class="method"><span id='clone_from.v' class='invisible'><code>fn <a href='https://doc.rust-lang.org/nightly/core/clone/trait.Clone.html#method.clone_from' class='fnname'>clone_from</a>(&mut self, source: &Self)</code><div class='since' title='Stable since Rust version 1.0.0'>1.0.0</div></span></h4> |
| <div class='docblock'><p>Performs copy-assignment from <code>source</code>. <a href="https://doc.rust-lang.org/nightly/core/clone/trait.Clone.html#method.clone_from">Read more</a></p> |
| </div></div><h3 class='impl'><span class='in-band'><code>impl <a class="trait" href="https://doc.rust-lang.org/nightly/core/fmt/trait.Debug.html" title="trait core::fmt::Debug">Debug</a> for <a class="struct" href="../regex/struct.RegexSet.html" title="struct regex::RegexSet">RegexSet</a></code></span><span class='out-of-band'><div class='ghost'></div><a class='srclink' href='../src/regex/re_set.rs.html#331-335' title='goto source code'>[src]</a></span></h3> |
| <div class='impl-items'><h4 id='method.fmt' class="method"><span id='fmt.v' class='invisible'><code>fn <a href='https://doc.rust-lang.org/nightly/core/fmt/trait.Debug.html#tymethod.fmt' class='fnname'>fmt</a>(&self, f: &mut <a class="struct" href="https://doc.rust-lang.org/nightly/core/fmt/struct.Formatter.html" title="struct core::fmt::Formatter">Formatter</a>) -> <a class="type" href="https://doc.rust-lang.org/nightly/core/fmt/type.Result.html" title="type core::fmt::Result">Result</a></code></span></h4> |
| <div class='docblock'><p>Formats the value using the given formatter.</p> |
| </div></div></section> |
| <section id='search' class="content hidden"></section> |
| |
| <section class="footer"></section> |
| |
| <aside id="help" class="hidden"> |
| <div> |
| <h1 class="hidden">Help</h1> |
| |
| <div class="shortcuts"> |
| <h2>Keyboard Shortcuts</h2> |
| |
| <dl> |
| <dt>?</dt> |
| <dd>Show this help dialog</dd> |
| <dt>S</dt> |
| <dd>Focus the search field</dd> |
| <dt>⇤</dt> |
| <dd>Move up in search results</dd> |
| <dt>⇥</dt> |
| <dd>Move down in search results</dd> |
| <dt>⏎</dt> |
| <dd>Go to active search result</dd> |
| <dt>+</dt> |
| <dd>Collapse/expand all sections</dd> |
| </dl> |
| </div> |
| |
| <div class="infos"> |
| <h2>Search Tricks</h2> |
| |
| <p> |
| Prefix searches with a type followed by a colon (e.g. |
| <code>fn:</code>) to restrict the search to a given type. |
| </p> |
| |
| <p> |
| Accepted types are: <code>fn</code>, <code>mod</code>, |
| <code>struct</code>, <code>enum</code>, |
| <code>trait</code>, <code>type</code>, <code>macro</code>, |
| and <code>const</code>. |
| </p> |
| |
| <p> |
| Search functions by type signature (e.g. |
| <code>vec -> usize</code> or <code>* -> vec</code>) |
| </p> |
| </div> |
| </div> |
| </aside> |
| |
| |
| |
| <script> |
| window.rootPath = "../"; |
| window.currentCrate = "regex"; |
| </script> |
| <script src="../main.js"></script> |
| <script defer src="../search-index.js"></script> |
| </body> |
| </html> |