The CLEAN function
Using the CLEAN function removes nonprintable characters text. For example,
if the text labels shown in a column are using crazy nonprintable characters
that end up showing as solid blocks or goofy symbols, you can use the
CLEAN function to clean up this text. The cleaned‐up text can be stored in
another column. You can then work with the cleaned text column.
The CLEAN function uses the following syntax:
CLEAN(text)
The text argument is the text string or a reference to the cell holding the text
string that you want to clean. For example, to clean the text stored in Cell A1,
use the following syntax:
CLEAN(A1)
The CONCATENATE function
The CONCATENATE function combines, or joins, chunks of text into a single
text string. The CONCATENATE function uses the following syntax:
CONCATENATE(text1,text2,text3,...)
The text1, text2, text3, and so on arguments are the chunks of text that
you want to combine into a single string. For example, if the city, state, and
ZIP code were stored in fields named city, state, and zip, you could
create a single text string that stores this information by using the following
syntax:
CONCATENATE(city,state,zip)
If city were Redmond, state were WA, and zip were 98052, this function
returns this text string:
RedmondWA98052
The smashed together nature of the concatenated city, state, and ZIP code
information isn’t a typographical mistake, by the way. To concatenate this
information but include spaces, you need to include spaces as function arguments.
For example, the following syntax:
CONCATENATE(“Redmond”, “ ”,“WA”, “ ”,“98052”)
returns the text string
Redmond WA 98052
The EXACT function
The EXACT function compares two text strings. If the two text strings are
exactly the same, the EXACT function returns the logical value for true, which
is 1. If the two text strings differ in any way, the EXACT function returns the
logical value for false, which is 0. The EXACT function is case‐sensitive. For
example, Redmond spelled with a capital R differs from redmond spelled with
a lowercase r.
The EXACT function uses the following syntax:
EXACT(text1,text2)
The text1 and text2 arguments are the text strings that you want to
compare.
For example, to check whether the two strings "Redmond" and
"redmond" are the same, use the following formula:
EXACT("Redmond","redmond")
This function returns the logical value for false, 0, because these two text
strings don’t match exactly. One begins with an uppercase R and the other
begins with a lowercase r.
The FIND function
The FIND function finds the starting character position of one text string
within another text string. For example, if you want to know at what position
within a text string the two‐letter state abbreviation WA starts, you could use
the FIND function.
The FIND function uses the following syntax:
FIND(find_text,within_text,start_num)
The find_text argument is the text that you’re looking for. The within_
text argument identifies where or what you’re searching. The start_num
argument tells Excel at what point within the string it should begin its search.
For example, to find at what point the two‐letter state abbreviation WA begins
in the string Redmond WA 98052, use the following formula:
FIND(“WA”,“Redmond WA 98052”,1)
The function returns the value 9 because WA begins at the ninth position
(because spaces are counted).
The start_num function argument is optional. If you omit this argument,
Excel begins searching at the very beginning of the string.
The FIXED function
The FIXED function rounds a value to specified precision and then converts
the rounded value to text. The function uses the following syntax:
FIXED(number,decimals,no_commas)
The number argument supplies the value that you want to round and convert
to text. The optional decimals argument tells Excel how many places to the
right of the decimal point that you want to round. The optional no_commas
argument needs to be either 1 (if you want commas) or 0 (if you don’t want
commas) in the returned text.
For example, to round to a whole number and convert to text the value
1234.56789, use the following formula:
FIXED(1234.56789,0,1)
The function returns the text 1,235.
The LEFT function
The LEFT function returns a specified number of characters from the left end
of a text string. The function uses the following syntax:
LEFT(text,num_chars)
The text argument either supplies the text string or references the cell holding
the text string. The optional num_chars argument tells Excel how many
characters to grab.
For example, to grab the leftmost seven characters from the text string
Redmond WA, use the following formula:
LEFT(“Redmond WA”,7)
The function returns the text Redmond.
The LEN function
The LEN function counts the number of characters in a text string. The function
uses the following syntax:
LEN(text)
The text argument either supplies the text string that you want to measure
or references the cell holding the text string. For example, to measure the
length of the text string in cell I81, use the following formula:
LEN(I81)
If cell I81 holds the text string Semper fidelis, the function returns the
value 14. Spaces are counted as characters, too.
The LOWER function
The LOWER function returns an all‐lowercase version of a text string. The
function uses the following syntax:
LOWER(text)
The text argument either supplies the text string that you want to convert
or references the cell holding the text string. For example, to convert the text
string PROFESSIONAL to professional, use the following formula:
LOWER(“PROFESSIONAL”)
The function returns professional.
The MID function
The MID function returns a chunk of text in the middle of text string. The
function uses the following syntax:
MID(text,start_num,num_char)
The text argument either supplies the text string from which you grab some
text fragment or it references the cell holding the text string. The start_num
argument tells Excel where the text fragment starts that you want to grab.
The num_char argument tells Excel how long the text fragment is. For example,
to grab the text fragment tac from the text string tic tac toe, use the
following formula:
=MID("tic tac toe",5,3)
The function returns tac.
The PROPER function
The PROPER function capitalizes the first letter in every word in a text string.
The function uses the following syntax:
PROPER(text)
The text argument either supplies the text string or references the cell
holding the text string. For example, to capitalize the initial letters in the text
string ambassador kennedy, use the following formula:
PROPER("ambassador kennedy")
The function returns the text string Ambassador Kennedy.
The REPLACE function
The REPLACE function replaces a portion of a text string. The function uses
the following syntax:
REPLACE(old_text,start_num,num_chars,new_text)
The old_text argument, which is case‐sensitive, either supplies the text
string from which you grab some text fragment or it references the cell holding
the text string. The start_num argument, which is the starting position,
tells Excel where the text starts that you want to replace. The num_chars
argument tells Excel the length of the text fragment (how many characters)
that you want to replace. The new_text argument, also case‐sensitive, tells
Excel what new text you want to use to replace the old text. For example, to
replace the name Chamberlain with the name Churchill in the text string
Mr. Chamberlain, use the following formula:
REPLACE("Mr. Chamberlain",5,11,"Churchill")
The function returns the text string Mr. Churchill.
The REPT function
The REPT function repeats a text string. The function uses the following
syntax:
REPT(text,number_times)
Chapter 3: Scrub‐a‐Dub‐Dub: Cleaning Data 73
The text argument either supplies the text string or references the cell holding
the text string. The number_times argument tells Excel how many times
you want to repeat the text. For example, the following formula:
REPT("Walla",2")
returns the text string WallaWalla.
The RIGHT function
The RIGHT function returns a specified number of characters from the right
end of a text string. The function uses the following syntax:
RIGHT(text,num_chars)
The text argument either supplies the text string that you want to manipulate
or references the cell holding the text string. The num_chars argument
tells Excel how many characters to grab.
For example, to grab the rightmost two characters from the text string
Redmond WA, use the following formula:
RIGHT(“Redmond WA”,2)
The function returns the text WA.
The SEARCH function
The SEARCH function calculates the starting position of a text fragment
within a text string. The function uses the following syntax:
SEARCH(find_text,within_text,start_num)
The find_text argument tells Excel what text fragment you’re looking for.
The within_text argument tells Excel what text string that you want to
search. The start_num argument tells Excel where to start its search. The
start_num argument is optional. If you leave it blank, Excel starts the search
at the beginning of the within_text string.
For example, to identify the position at which the text fragment Churchill
starts in the text string Mr. Churchill, use the following formula:
SEARCH("Churchill","Mr. Churchill",1)
The function returns the value 5.
The SUBSTITUTE function
The SUBSTITUTE function replaces occurrences of text in a text string. The
function uses the following syntax:
SUBSTITUTE(text,old_text,new_text,instances)
The text argument tells Excel what text string you want to edit by replacing
some text fragment. The old_text argument identifies the to‐be‐replaced
text fragment. The new_text supplies the new replacement text.
As an example of how the SUBSTITUTE function works, suppose that you
need to replace the word Senator with the word President in the text
string Senator Obama.
SUBSTITUTE("Senator Obama","Senator","President")
The function returns the text string President Obama.
The instances argument is optional, but you can use it to tell Excel for
which instance of old_text you want to make the substitution. For example,
the function
SUBSTITUTE("Senator Senator","Senator","President",1)
returns the text string President Senator.
The function
SUBSTITUTE("Senator Senator Obama","Senator","President",2)
returns the text string Senator President Obama.
If you leave the instances argument blank, Excel replaces each occurrence of
the old_text with the new_text. For example, the function
SUBSTITUTE("Senator Senator Obama","Senator","President")
returns the text string President President Obama.
The T function
The T function returns its argument if the argument is text. If the argument
isn’t text, the function returns nothing. The function uses the following syntax:
T(value)
For example, the formula T(123) returns nothing because 123 is a value.
The formula T(“Seattle”) returns Seattle because Seattle is a text
string.
The TEXT function
The TEXT function formats a value and then returns the value as text. The
function uses the following syntax:
TEXT(value,format_text)
The value argument is the value that you want formatted and returned as
text. The format_text argument is a text string that shows the currency
symbol and placement, commas, and decimal places that you want. For
example, the formula
=TEXT(1234.5678,"$##,###.00")
returns the text $1,234.57.
Note that the function rounds the value.
The TRIM function
The TRIM function removes extra spaces from the right end of a text string.
The function uses the following syntax:
TRIM(text)
The text argument is the text string or, more likely, a reference to the cell
holding the text string.
The UPPER function
The UPPER function returns an all‐uppercase version of a text string. The
function uses the following syntax:
UPPER(text)
The text argument either supplies the text string that you want to convert
or it references the cell holding the text string. For example, to convert
76 Part I: Getting Started with Data Analysis
the text string professional to PROFESSIONAL, you can use the following
formula:
UPPER(“professional”)
The function returns the text string PROFESSIONAL.
The VALUE function
The VALUE function converts a text string that looks like a value to a value.
The function uses the following syntax:
VALUE(text)
The text argument either supplies the text string that you want to convert
or it references the cell holding the text string. For example, to convert the
text string $123,456.78 — assume that this isn’t a value but a text string —
you can use the following formula:
VALUE(“$123,456.78”)
The function returns the value 123456.78.
Comments
Post a Comment