Python Strings
Working with Text: Understanding Strings in Python
Text is everywhere in the world of programming. We use text for names, messages, instructions, labels, file paths, and getting input from users. In Python, text is stored as a string.
We've already seen that the str type is used for text. Now, let's take a closer look at how to work with strings: how to create them, access parts of them, perform common text tasks, and understand how they behave in Python.
What is a String?
A string is an ordered sequence of characters. This means the characters are arranged in a specific order, and each character has a position.
You create a string by putting characters inside quotes. Python lets you use single quotes ('...'), double quotes ("..."), or triple quotes ('''...''' or """...""").
# Creating strings using different quotes
single_quoted_string = 'Hello'
double_quoted_string = "World"
# Triple quotes are useful for strings that span multiple lines
multi_line_string = """This is a string
that covers
several lines."""
# Or for strings that contain both single and double quotes easily
mixed_quotes_string = '''He said, "Hello!" and I replied, 'Hi there!' '''
# A string can be empty
empty_string = ""
Remember from the Variables article that when you create a string value, Python creates a string object in memory, and your variable name points to that object.
Basic String Operations (Quick Review)
You've already seen how to combine and repeat strings using operators:
- Concatenation (+): Joins two or more strings together.
greeting = "Hello" + " " + "Python"
print(greeting) # Output: Hello Python
- Repetition (*): Repeats a string a certain number of times.
separator = "=" * 15
print(separator) # Output: ===============
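One thing worth knowing about +: it only joins strings with other strings. The sketch below uses made-up values (player and score are hypothetical names) to show what happens when a number is mixed in, and one way around it:

```python
player = "Ada"   # hypothetical example values
score = 42

# Joining a str and an int directly raises a TypeError
try:
    line = player + " scored " + score
except TypeError:
    line = None
    print("Cannot join a string and a number directly")

# Converting the number with str() first makes the concatenation work
line = player + " scored " + str(score)
print(line)  # Output: Ada scored 42
```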
Accessing Parts of a String
Because strings are ordered sequences, you can access individual characters or get smaller parts of a string.
String Indexing (Getting One Character)
Each character in a string has a position number, called an index. Python uses 0-based indexing, meaning the index of the first character is 0, the second character is 1, and so on.
You access a character by putting its index number inside square brackets [] right after the string variable's name.
my_word = "Python"
# 'P' is at index 0
# 'y' is at index 1
# 't' is at index 2
# 'h' is at index 3
# 'o' is at index 4
# 'n' is at index 5
first_char = my_word[0]
print(first_char) # Output: P
third_char = my_word[2]
print(third_char) # Output: t
Python also allows negative indexing to count from the end of the string. The last character is at index -1, the second to last is at -2, and so on.
my_word = "Python"
last_char = my_word[-1]
print(last_char) # Output: n
second_last_char = my_word[-2]
print(second_last_char) # Output: o
If you try to access an index that doesn't exist (a number too high or too low), Python raises an IndexError, which means the index is out of range.
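As a rough illustration of that error, the sketch below asks for index 6 of a six-character string and catches the result with try/except (a construct not covered in this article, used here only so the example runs to the end):

```python
my_word = "Python"   # valid indexes: 0 to 5, or -1 to -6

try:
    character = my_word[6]   # one past the last valid index
except IndexError as error:
    character = None
    print(f"IndexError: {error}")

print(character)  # Output: None
```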
String Slicing (Getting a Substring)
Slicing lets you get a sequence of characters (a part of the string, called a substring) by specifying a range of indices. You use the colon : inside the square brackets.
The basic syntax is string[start:stop]. This gives you characters starting from the start index up to, but not including, the stop index.
my_string = "Programming"
# Indices: 01234567890 (the final 0 stands for index 10)
# Get characters from index 0 up to (not including) index 3
part1 = my_string[0:3]
print(part1) # Output: Pro
# Get characters from index 4 up to (not including) index 8
part2 = my_string[4:8]
print(part2) # Output: ramm
You can also add a step: string[start:stop:step]. The step tells Python how far to move from one selected character to the next. The default step is 1, which takes every character; a step of 2 takes every second character.
my_string = "Programming"
# Get characters from index 0 up to index 11, taking every 2nd character
part_with_step = my_string[0:11:2]
print(part_with_step) # Output: Pormig
# Using a step of -1 is a common way to reverse a string
reversed_string = my_string[::-1]
print(reversed_string) # Output: gnimmargorP
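The reversing trick lends itself to a quick palindrome check (a word that reads the same in both directions). A small sketch, with example words chosen purely for illustration:

```python
word = "level"
other_word = "Python"

# A word is a palindrome if it equals its own reverse
word_is_palindrome = word == word[::-1]
other_is_palindrome = other_word == other_word[::-1]

print(word_is_palindrome)   # Output: True
print(other_is_palindrome)  # Output: False
```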
Python has default values for start and stop:
- If you leave out start, it defaults to the beginning of the string (index 0): my_string[:3] is the same as my_string[0:3].
- If you leave out stop, it defaults to the end of the string: my_string[4:] is the same as my_string[4:len(my_string)].
- If you leave out both, you get the whole string: my_string[:].
Slicing never changes the original string. It returns a string containing the selected characters.
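To see these defaults side by side, here is a short sketch; the variable names are only for illustration. The last lines show one related behavior: unlike indexing, a slice that runs past the end of the string does not raise an error.

```python
my_string = "Programming"

first_three = my_string[:3]     # same as my_string[0:3]
from_index_4 = my_string[4:]    # same as my_string[4:len(my_string)]
everything = my_string[:]       # the whole string

print(first_three)   # Output: Pro
print(from_index_4)  # Output: ramming
print(everything)    # Output: Programming

# A stop value beyond the end is simply cut off at the end of the string
past_the_end = my_string[8:100]
print(past_the_end)  # Output: ing
```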
String Length (len())
To find out how many characters are in a string (its length), use the built-in len() function.
my_string = "Python"
string_length = len(my_string)
print(string_length) # Output: 6
String Methods (Built-in Text Actions)
String objects in Python come with many built-in tools called methods that make common text tasks easy. You call a method using a dot . after the string or variable name, followed by the method name and parentheses ().
Changing Case
These methods return a new string with the case changed:
- .lower(): Converts all characters to lowercase.
- .upper(): Converts all characters to uppercase.
- .capitalize(): Converts the first character of the string to uppercase and the rest to lowercase.
- .title(): Converts the first character of each word to uppercase and the rest to lowercase.
text = "hELLo wORLd"
print(text.lower()) # Output: hello world
print(text.upper()) # Output: HELLO WORLD
print(text.capitalize()) # Output: Hello world
print(text.title()) # Output: Hello World
Checking Content
These methods check what a string contains and return True or False (booleans).
- .isdigit(): True if all characters are digits (such as 0-9) and the string is not empty.
- .isalpha(): True if all characters are letters (such as a-z, A-Z) and the string is not empty.
- .isalnum(): True if all characters are letters or digits and the string is not empty.
- .isspace(): True if all characters are whitespace (spaces, tabs, newlines) and the string is not empty.
print("123".isdigit()) # Output: True
print("Python".isalpha()) # Output: True
print("Py3".isalnum()) # Output: True
print(" ".isspace()) # Output: True
print("Hello World".isalpha()) # Output: False (because of the space)
print("".isspace()) # Output: False (empty string)
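A common surprise with .isdigit() is that it looks only at individual characters, so text that reads like a number to a person can still fail the check. A brief sketch with example values picked for illustration:

```python
whole_number = "42".isdigit()
negative_number = "-5".isdigit()
decimal_number = "3.5".isdigit()

print(whole_number)     # Output: True
print(negative_number)  # Output: False (the minus sign is not a digit)
print(decimal_number)   # Output: False (the dot is not a digit)
```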
Finding and Replacing
- .find(substring): Searches for a substring within the string and returns the index of the first time it appears. Returns -1 if the substring is not found.
- .replace(old, new): Returns a new string where all occurrences of the old substring are replaced with the new substring. You can add an optional third argument to limit how many replacements happen.
message = "Hello World, Hello Python"
print(message.find("World")) # Output: 6 (W is at index 6)
print(message.find("Python")) # Output: 19 (P is at index 19)
print(message.find("Java")) # Output: -1 (not found)
new_message = message.replace("Hello", "Hi")
print(new_message) # Output: Hi World, Hi Python
replace_once = message.replace("Hello", "Hi", 1) # Replace only the first "Hello"
print(replace_once) # Output: Hi World, Hello Python
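Because -1 is also a valid negative index, it is worth checking the result of .find() before using it as a position. The following sketch shows the trap and one way to avoid it; the result variable is just an illustrative name.

```python
message = "Hello World, Hello Python"

position = message.find("Java")
print(position)  # Output: -1

# Used unchecked, -1 silently selects the last character instead of failing
wrong_character = message[position]
print(wrong_character)  # Output: n

# Checking for -1 first avoids that trap
if position == -1:
    result = "not found"
else:
    result = f"found at index {position}"
print(result)  # Output: not found
```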
Splitting and Joining
- .split(separator): Divides a string into a list of substrings based on a separator string. If no separator is given, it splits by any whitespace (spaces, tabs, newlines).
sentence = "This is a sample sentence"
words = sentence.split(" ") # Split using space as the separator
print(words) # Output: ['This', 'is', 'a', 'sample', 'sentence']
data = "item1,item2,item3"
items = data.split(",") # Split using comma as the separator
print(items) # Output: ['item1', 'item2', 'item3']
- .join(iterable_of_strings): This is the opposite of split(). It takes an iterable (like a list or tuple) of strings and joins them together into a single string. The string you call the method on becomes the "glue" or separator between the items.
word_list = ["Join", "these", "words"]
joined_with_space = " ".join(word_list)
print(joined_with_space) # Output: Join these words
joined_with_dash = "-".join(word_list)
print(joined_with_dash) # Output: Join-these-words
chars_list = ['P', 'y', 't', 'h', 'o', 'n']
word = "".join(chars_list) # Use an empty string as glue for no separator
print(word) # Output: Python
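Splitting on a single space and splitting with no separator behave differently when the text contains repeated spaces. The sketch below builds such a string explicitly (so the number of spaces is unambiguous) and then uses split() together with join() to tidy it up:

```python
# "This", two spaces, "is", three spaces, "spaced out"
messy = "This" + " " * 2 + "is" + " " * 3 + "spaced out"

by_space = messy.split(" ")    # every single space is a separator
by_whitespace = messy.split()  # runs of whitespace count as one separator

print(by_space)       # Output: ['This', '', 'is', '', '', 'spaced', 'out']
print(by_whitespace)  # Output: ['This', 'is', 'spaced', 'out']

# Splitting on whitespace and re-joining collapses the extra spaces
cleaned = " ".join(by_whitespace)
print(cleaned)        # Output: This is spaced out
```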
Removing Whitespace (strip, lstrip, rstrip)
These methods remove whitespace characters (spaces, tabs, newlines) from the beginning or end of a string.
- .strip(): Removes whitespace from both the start and the end.
- .lstrip(): Removes whitespace only from the left (start).
- .rstrip(): Removes whitespace only from the right (end).
padded_text = " Hello World \n"
print(padded_text.strip()) # Output: "Hello World"
print(padded_text.lstrip()) # Output: "Hello World \n"
print(padded_text.rstrip()) # Output: " Hello World"
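One place these methods tend to be useful is tidying text that a person typed. The sketch below assumes a hypothetical raw_answer value and chains .strip() with .lower() so that stray whitespace and capitalization do not affect the comparison:

```python
raw_answer = "  YES \n"   # hypothetical text as a user might type it

# Each method returns a new string, so the calls can be chained
answer = raw_answer.strip().lower()

print(answer)           # Output: yes
print(answer == "yes")  # Output: True
```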
String Formatting (Using f-strings)
As we saw in the Variables article, f-strings are usually the simplest and most readable way to create strings that mix fixed text with variable values or code results.
item_name = "Potion"
cost = 15
# Easily include variables using {} inside an f-string
print(f"You bought a {item_name} for {cost} gold.")
# Output: You bought a Potion for 15 gold.
# You can put simple math inside {} too
age = 30
print(f"Next year, you will be {age + 1} years old.")
# Output: Next year, you will be 31 years old.
# And use formatting for numbers (like float decimal places, seen in Data Types)
price = 99.5
print(f"Price: ${price:.2f}") # Format price to 2 decimal places
# Output: Price: $99.50
f-strings are the modern standard for formatting strings in Python.
Special String Features
Escape Characters
Sometimes you need to put characters in a string that are hard to type directly or have special meaning (like a newline or a quotation mark that matches the string's quotes). The backslash character (\) is used to "escape" the next character, giving it a special meaning.
Here are some common escape sequences:
- \n: Newline (starts a new line)
- \t: Tab (adds a tab space)
- \': Single quote (allows using a single quote inside a single-quoted string)
- \": Double quote (allows using a double quote inside a double-quoted string)
- \\: Backslash (to include a literal backslash character)
print("This is line 1\nThis is line 2") # Prints two lines
print("Item\tQuantity") # Prints with a tab
# Using quotes that match the string's outer quotes requires escaping
print('He said, \'Hi!\'') # Output: He said, 'Hi!'
print("She said, \"Bye!\"") # Output: She said, "Bye!"
# Including a literal backslash (common in file paths on Windows)
print("Path: C:\\Users\\Admin") # You need two backslashes
# Output: Path: C:\Users\Admin
Raw Strings (r"...")
If you have a string with many backslashes that you don't want Python to interpret as escape characters (like in file paths or regular expressions), you can create a raw string by putting an r before the opening quote. In a raw string, backslashes are treated as normal characters. One limitation: a raw string cannot end with a single backslash, because that backslash would be read together with the closing quote.
# Normal string requires escaping backslashes for a literal path
normal_path = "C:\\Users\\Name\\data.txt"
print(normal_path) # Output: C:\Users\Name\data.txt
# Raw string is simpler for literal paths
raw_path = r"C:\Users\Name\data.txt"
print(raw_path) # Output: C:\Users\Name\data.txt (Backslashes are not treated as escapes)
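Comparing lengths is one simple way to see the difference between a normal string and a raw string. In this small sketch (variable names are illustrative), the same characters between the quotes produce strings of different lengths:

```python
normal = "a\nb"   # \n is a single newline character
raw = r"a\nb"     # the backslash and the n are two separate characters

print(len(normal))  # Output: 3
print(len(raw))     # Output: 4

# The raw string is equal to the normal string with an escaped backslash
same_text = raw == "a\\nb"
print(same_text)    # Output: True
```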
String Comparisons and Checks
You can use comparison operators (==, !=, >, <, >=, <=) and membership operators (in, not in) with strings.
- Equality (==, !=): Checks if two strings have the exact same sequence of characters. Comparison is case-sensitive.
print("apple" == "apple") # Output: True
print("Apple" == "apple") # Output: False (different case)
print("apple" != "banana") # Output: True
- Order (>, <, etc.): Compares strings based on the alphabetical (or more technically, Unicode) order of their characters. This is how Python sorts strings.
print("apple" < "banana") # Output: True ('a' comes before 'b')
print("cat" > "car") # Output: True ('t' comes after 'r')
# Uppercase letters come before lowercase in standard Unicode order
print("Apple" < "apple") # Output: True
- Membership (in, not in): Checks if a character or a shorter string (substring) is found anywhere within a larger string. The characters of the substring must appear next to each other, in the same order.
message = "Hello World"
print("World" in message) # Output: True (The substring "World" is in "Hello World")
print("llo" in message) # Output: True (The substring "llo" is in "Hello World")
print("z" in message) # Output: False
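Since comparisons are case-sensitive, one common approach when case should not matter is to convert both sides to the same case first. The sketch below applies that idea to an equality check and to sorting; the fruits list is made-up example data, and sorted() with key=str.lower is one way to do it among several.

```python
first = "Apple"
second = "apple"

# Lowercasing both sides makes the comparison ignore case
same_ignoring_case = first.lower() == second.lower()
print(same_ignoring_case)  # Output: True

fruits = ["banana", "Cherry", "apple"]

# Default order: uppercase letters sort before lowercase ones
default_order = sorted(fruits)
print(default_order)       # Output: ['Cherry', 'apple', 'banana']

# Sorting by the lowercase version of each string ignores case
ignoring_case = sorted(fruits, key=str.lower)
print(ignoring_case)       # Output: ['apple', 'banana', 'Cherry']
```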
Immutability of Strings
It's very important to remember that string objects in Python are immutable. This means that once a string object is created in memory, you cannot change its contents directly.
If you try to change a character at a specific index, you will get a TypeError:
my_string = "Hello"
# my_string[0] = "J" # This line would cause a TypeError!
# TypeError: 'str' object does not support item assignment
When you use string methods like .upper(), .replace(), or even slicing [:], Python does not modify the original string object. Instead, these operations compute a result and return it as a string object, leaving the original untouched.
If you want your variable name to point to the new, modified string, you must use the assignment operator (=) to make the variable point to the object that the method returned:
my_string = "hello"
# 'my_string' points to the object "hello"
# my_string.upper() calculates a new string "HELLO" and returns it.
# 'my_string' still points to "hello". The result "HELLO" is returned but not stored anywhere here.
my_string.upper()
print(my_string) # Output: hello (Original string unchanged)
# To make the variable point to the new string:
my_string = my_string.upper() # Now 'my_string' points to the new object "HELLO"
print(my_string) # Output: HELLO (Variable now points to the new string)
Understanding immutability is key to correctly working with strings and many other types in Python.
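Since a character cannot be replaced in place, the usual workaround is to build a new string from pieces of the old one. A minimal sketch using slicing and concatenation (changed is an illustrative name):

```python
my_string = "Hello"

# Combine a new first letter with everything from index 1 onward
changed = "J" + my_string[1:]

print(changed)    # Output: Jello
print(my_string)  # Output: Hello (the original is untouched)
```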
Conclusion
Strings are fundamental for handling all kinds of text data in your Python programs. Learning how to create, access parts of, manipulate, and format strings is a core skill.
You've learned about:
- Creating strings using different quotes.
- Basic operations: joining (+) and repeating (*).
- Accessing individual characters using indexing ([]).
- Getting parts of strings using slicing ([:]).
- Checking length with len().
- Many useful string methods for changing case, finding/replacing text, splitting/joining text, and cleaning up whitespace.
- Handling special characters using escape sequences (\n, \t, etc.) and raw strings (r"...").
- Comparing strings (==, >, etc.) and checking if characters or substrings are in a string.
- The crucial concept of string immutability: methods return new strings, they don't change the original.
Strings are versatile and powerful. Practice using the various methods and techniques to confidently work with text data in your projects!