Java Regular Expressions
Imagine this: you're going through a pile of documents, trying to find all the email addresses in them. With Java Regular Expressions, you can create a special pattern that tells Java exactly what to look for. So instead of manually scanning every word, you just give Java the pattern, and it'll do the heavy lifting for you.
But wait, there's more! Regular expressions aren't just for finding things; they can also help you validate data. Want to make sure a user's input matches a specific format, like a phone number or a ZIP code? Regex has got you covered.
Now, let's talk about how it works. In Java, you use the java.util.regex package to work with regular expressions. You create a Pattern object, which represents your compiled regex pattern, and then use it to create a Matcher object, which can search through text for matches.
Once you've got your Matcher, you can start searching, replacing, or validating text based on your regex pattern. It's like having a powerful search-and-replace tool built right into your Java code!
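To make the Pattern and Matcher workflow concrete, here is a minimal sketch of the three tasks mentioned above: searching, replacing, and validating. The class name MatcherBasics, its helper methods, and the sample text are made up for illustration; only Pattern and Matcher come from java.util.regex.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatcherBasics {
    // Compile once and reuse: Pattern objects are immutable and thread-safe.
    private static final Pattern NUMBER = Pattern.compile("\\d+");

    // Search: collect every run of digits found in the text.
    static List<String> findNumbers(String text) {
        List<String> found = new ArrayList<>();
        Matcher matcher = NUMBER.matcher(text);
        while (matcher.find()) {
            found.add(matcher.group());
        }
        return found;
    }

    // Replace: swap every run of digits for a '#' placeholder.
    static String maskNumbers(String text) {
        return NUMBER.matcher(text).replaceAll("#");
    }

    // Validate: true only if the whole text is a run of digits.
    static boolean isNumber(String text) {
        return NUMBER.matcher(text).matches();
    }

    public static void main(String[] args) {
        String text = "Order 66 shipped in 3 boxes";
        System.out.println(findNumbers(text)); // [66, 3]
        System.out.println(maskNumbers(text)); // Order # shipped in # boxes
        System.out.println(isNumber("2024"));  // true
        System.out.println(isNumber("20x4"));  // false
    }
}
```

Note the split between find(), which looks for the pattern anywhere in the text, and matches(), which requires the entire text to fit the pattern.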
Regex can be a bit tricky to master at first. The syntax can seem mysterious, and creating the right pattern might take some trial and error. But fear not! With a bit of practice and patience, you'll be wielding regex like a pro in no time.
Regular Expression Symbols
Symbol | Description |
---|---|
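. | Matches any single character except a line terminator (unless the DOTALL flag is set) |
^ | Matches the beginning of the input (or of each line with the MULTILINE flag) |
$ | Matches the end of the input (or of each line with the MULTILINE flag) |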
* | Matches zero or more occurrences of the previous character or group |
+ | Matches one or more occurrences of the previous character or group |
? | Matches zero or one occurrence of the previous character or group |
{ } | Specifies the exact number of occurrences or a range of occurrences, as in {3} or {2,5} |
[ ] | Defines a character class, matches any single character within the brackets |
( ) | Groups patterns together |
\| | Acts as an OR operator, matches either the expression before or after |
\ | Escapes a special character, or indicates a special sequence |
\d | Matches any digit (equivalent to [0-9]) |
\D | Matches any non-digit character (equivalent to [^0-9]) |
\w | Matches any word character (equivalent to [a-zA-Z0-9_]) |
\W | Matches any non-word character (equivalent to [^a-zA-Z0-9_]) |
\s | Matches any whitespace character (space, tab, newline, etc.) |
\S | Matches any non-whitespace character |
\b | Matches a word boundary |
\B | Matches a non-word boundary |
\n | Matches a newline character |
\t | Matches a tab character |
\r | Matches a carriage return character |
\0n | Matches the character with octal value 0n (for example, \00 is the null character); a bare \0 is not a valid Java regex |
\xhh | Matches the ASCII character represented by the hexadecimal number hh |
\uhhhh | Matches the Unicode character represented by the hexadecimal number hhhh |
[xyz] | Matches any single character from the character set xyz |
[^xyz] | Matches any single character not in the character set xyz |
[a-z] | Matches any single character in the range from 'a' to 'z' |
[^a-z] | Matches any single character not in the range from 'a' to 'z' |
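A few of the symbols above can be tried out with a small sketch like the following. The class SymbolTour and its found helper are hypothetical names used only for this illustration; the inputs were picked to show one symbol at a time.

```java
import java.util.regex.Pattern;

class SymbolTour {
    // Hypothetical helper: true if the regex matches anywhere in the input.
    static boolean found(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    // Same check, but ^ and $ also match at the start and end of each line.
    static boolean foundMultiline(String regex, String input) {
        return Pattern.compile(regex, Pattern.MULTILINE).matcher(input).find();
    }

    public static void main(String[] args) {
        // In Java source code, each regex backslash is written as "\\".
        System.out.println(found("\\d", "room 7"));           // true: a digit is present
        System.out.println(found("colou?r", "color"));        // true: ? makes the 'u' optional
        System.out.println(found("(ab){2}", "xababx"));       // true: the group "ab" twice in a row
        System.out.println(found("^cat|dog$", "hot dog"));    // true: | accepts either side
        System.out.println(found("a.c", "a\nc"));             // false: . does not match a newline
        System.out.println(found("\\bcat\\b", "concat"));     // false: no word boundary before "cat"
        System.out.println(found("[^a-z]", "abc"));           // false: every character is in a-z
        System.out.println(found("^b$", "a\nb\nc"));          // false: ^ and $ anchor the whole input
        System.out.println(foundMultiline("^b$", "a\nb\nc")); // true: MULTILINE anchors each line
    }
}
```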
Examples
Getting Started with Pattern Matching
Let's start with a simple example. Suppose you want to check if a string contains the word "Hello". Here's how you can do it using regular expressions in Java:
java
import java.util.regex.*;

public class RegexExample {
    public static void main(String[] args) {
        String text = "Hello, world!";
        String pattern = ".*Hello.*"; // .* means zero or more characters
        boolean isMatch = Pattern.matches(pattern, text);
        System.out.println("Is there a match? " + isMatch);
    }
}
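In this example, we use the Pattern.matches() method to check if our text string matches the pattern. Pattern.matches() succeeds only when the entire string fits the pattern, which is why "Hello" is wrapped in .* on both sides. The .* in the pattern means zero or more characters, so it will match any single-line text that contains the word "Hello".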
NOTE
The above pattern is case-sensitive, so it will not match "hello, world!".
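As an alternative sketch of the same check, Matcher.find() searches for the pattern anywhere in the text, so the surrounding .* is not needed, and the Pattern.CASE_INSENSITIVE flag lifts the case restriction. The class and method names below are made up for illustration.

```java
import java.util.regex.Pattern;

class ContainsWord {
    // find() succeeds if the pattern occurs anywhere in the text.
    static boolean containsHello(String text) {
        return Pattern.compile("Hello").matcher(text).find();
    }

    // Same check, ignoring upper and lower case.
    static boolean containsHelloIgnoreCase(String text) {
        return Pattern.compile("hello", Pattern.CASE_INSENSITIVE).matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(containsHello("Hello, world!"));         // true
        System.out.println(containsHello("hello, world!"));         // false: case differs
        System.out.println(containsHelloIgnoreCase("HELLO there")); // true

        // With multi-line text, ".*Hello.*" fails because . stops at line breaks,
        // while find() still locates the word.
        System.out.println(Pattern.matches(".*Hello.*", "Hi\nHello")); // false
        System.out.println(containsHello("Hi\nHello"));                // true
    }
}
```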
Power of Character Classes
Now, let's say you want to find all the vowels in a string. You can use character classes in regular expressions to achieve this:
java
import java.util.regex.*;

public class RegexExample {
    public static void main(String[] args) {
        String text = "Hello, world!";
        String pattern = "[aeiou]";
        Matcher matcher = Pattern.compile(pattern, Pattern.CASE_INSENSITIVE).matcher(text);
        while (matcher.find()) {
            System.out.println("Found: " + matcher.group());
        }
    }
}
In this example, we use the character class [aeiou] to match any vowel in the string. The Pattern.CASE_INSENSITIVE flag ensures that the matching is case-insensitive, so it will match both uppercase and lowercase vowels.
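Character classes also support ranges and negation, as listed in the symbol table. The sketch below counts matches for a few classes; CharClassDemo, its count helper, and the sample text are hypothetical and exist only for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CharClassDemo {
    // Hypothetical helper: counts how many times the regex matches in the text.
    static int count(String regex, String text) {
        Matcher matcher = Pattern.compile(regex).matcher(text);
        int matches = 0;
        while (matcher.find()) {
            matches++;
        }
        return matches;
    }

    public static void main(String[] args) {
        String text = "Hello, World 42!";
        System.out.println(count("[aeiouAEIOU]", text));  // 3: e, o, o
        System.out.println(count("[A-Z]", text));         // 2: H, W
        System.out.println(count("[0-9]", text));         // 2: 4, 2
        System.out.println(count("[^a-zA-Z0-9 ]", text)); // 2: the comma and the '!'
    }
}
```

Listing both cases inside the class, as in [aeiouAEIOU], is an alternative to passing the Pattern.CASE_INSENSITIVE flag.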
Exploring Quantifiers and Anchors
Quantifiers and anchors are powerful tools that allow you to specify how many times a character or group of characters should appear, as well as where they should appear in the string.
For example, let's say you want to match a phone number in the format XXX-XXX-XXXX. You can use quantifiers and anchors to achieve this:
java
import java.util.regex.*;

public class RegexExample {
    public static void main(String[] args) {
        String text = "Call me at 123-456-7890";
        String pattern = "\\b\\d{3}-\\d{3}-\\d{4}\\b"; // \b represents word boundaries
        Matcher matcher = Pattern.compile(pattern).matcher(text);
        while (matcher.find()) {
            System.out.println("Found phone number: " + matcher.group());
        }
    }
}
In this example, the \d{3}-\d{3}-\d{4} pattern matches three digits followed by a hyphen, then another three digits followed by a hyphen, and finally four digits. The \b on each side represents a word boundary, ensuring that we match the entire phone number and not just part of a longer run of digits or letters.
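The example above finds a phone number inside a longer text. To validate that an entire input is a phone number, one option is Matcher.matches(), which requires the whole string to fit the pattern. The sketch below also wraps each part in a capturing group; PhoneValidator and its methods are hypothetical names, and the XXX-XXX-XXXX format is the only one it accepts.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PhoneValidator {
    // Each pair of parentheses captures one part of the number.
    private static final Pattern PHONE = Pattern.compile("(\\d{3})-(\\d{3})-(\\d{4})");

    // True only if the whole input has the form XXX-XXX-XXXX.
    static boolean isValid(String input) {
        return PHONE.matcher(input).matches();
    }

    // Returns the first three digits, or null if the input is not a valid number.
    static String areaCode(String input) {
        Matcher matcher = PHONE.matcher(input);
        return matcher.matches() ? matcher.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(isValid("123-456-7890"));            // true
        System.out.println(isValid("Call me at 123-456-7890")); // false: extra text
        System.out.println(isValid("123-456-78901"));           // false: too many digits
        System.out.println(areaCode("123-456-7890"));           // 123
    }
}
```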