Skip to main content

Data Cleaning Functions in Excel

 


The CLEAN function

Using the CLEAN function removes nonprintable characters text. For example,

if the text labels shown in a column are using crazy nonprintable characters

that end up showing as solid blocks or goofy symbols, you can use the

CLEAN function to clean up this text. The cleaned‐up text can be stored in

another column. You can then work with the cleaned text column.


The CLEAN function uses the following syntax:

CLEAN(text)

The text argument is the text string or a reference to the cell holding the text

string that you want to clean. For example, to clean the text stored in Cell A1,

use the following syntax:

CLEAN(A1)


The CONCATENATE function

The CONCATENATE function combines, or joins, chunks of text into a single

text string. The CONCATENATE function uses the following syntax:

CONCATENATE(text1,text2,text3,...)

The text1, text2, text3, and so on arguments are the chunks of text that

you want to combine into a single string. For example, if the city, state, and

ZIP code were stored in fields named city, state, and zip, you could

create a single text string that stores this information by using the following

syntax:

CONCATENATE(city,state,zip)

If city were Redmond, state were WA, and zip were 98052, this function

returns this text string:

RedmondWA98052

The smashed together nature of the concatenated city, state, and ZIP code

information isn’t a typographical mistake, by the way. To concatenate this

information but include spaces, you need to include spaces as function arguments.

For example, the following syntax:

CONCATENATE(“Redmond”, “ ”,“WA”, “ ”,“98052”)

returns the text string

Redmond WA 98052


The EXACT function

The EXACT function compares two text strings. If the two text strings are

exactly the same, the EXACT function returns the logical value for true, which


is 1. If the two text strings differ in any way, the EXACT function returns the

logical value for false, which is 0. The EXACT function is case‐sensitive. For

example, Redmond spelled with a capital R differs from redmond spelled with

a lowercase r.

The EXACT function uses the following syntax:

EXACT(text1,text2)

The text1 and text2 arguments are the text strings that you want to

compare.

For example, to check whether the two strings "Redmond" and

"redmond" are the same, use the following formula:

EXACT("Redmond","redmond")

This function returns the logical value for false, 0, because these two text

strings don’t match exactly. One begins with an uppercase R and the other

begins with a lowercase r.


The FIND function

The FIND function finds the starting character position of one text string

within another text string. For example, if you want to know at what position

within a text string the two‐letter state abbreviation WA starts, you could use

the FIND function.

The FIND function uses the following syntax:

FIND(find_text,within_text,start_num)

The find_text argument is the text that you’re looking for. The within_

text argument identifies where or what you’re searching. The start_num

argument tells Excel at what point within the string it should begin its search.

For example, to find at what point the two‐letter state abbreviation WA begins

in the string Redmond WA 98052, use the following formula:

FIND(“WA”,“Redmond WA 98052”,1)

The function returns the value 9 because WA begins at the ninth position

(because spaces are counted).

The start_num function argument is optional. If you omit this argument,

Excel begins searching at the very beginning of the string.


The FIXED function

The FIXED function rounds a value to specified precision and then converts

the rounded value to text. The function uses the following syntax:

FIXED(number,decimals,no_commas)

The number argument supplies the value that you want to round and convert

to text. The optional decimals argument tells Excel how many places to the

right of the decimal point that you want to round. The optional no_commas

argument needs to be either 1 (if you want commas) or 0 (if you don’t want

commas) in the returned text.

For example, to round to a whole number and convert to text the value

1234.56789, use the following formula:

FIXED(1234.56789,0,1)

The function returns the text 1,235.

The LEFT function

The LEFT function returns a specified number of characters from the left end

of a text string. The function uses the following syntax:

LEFT(text,num_chars)

The text argument either supplies the text string or references the cell holding

the text string. The optional num_chars argument tells Excel how many

characters to grab.

For example, to grab the leftmost seven characters from the text string

Redmond WA, use the following formula:

LEFT(“Redmond WA”,7)

The function returns the text Redmond.

The LEN function

The LEN function counts the number of characters in a text string. The function

uses the following syntax:

LEN(text)


The text argument either supplies the text string that you want to measure

or references the cell holding the text string. For example, to measure the

length of the text string in cell I81, use the following formula:

LEN(I81)

If cell I81 holds the text string Semper fidelis, the function returns the

value 14. Spaces are counted as characters, too.

The LOWER function

The LOWER function returns an all‐lowercase version of a text string. The

function uses the following syntax:

LOWER(text)

The text argument either supplies the text string that you want to convert

or references the cell holding the text string. For example, to convert the text

string PROFESSIONAL to professional, use the following formula:

LOWER(“PROFESSIONAL”)

The function returns professional.

The MID function

The MID function returns a chunk of text in the middle of text string. The

function uses the following syntax:

MID(text,start_num,num_char)

The text argument either supplies the text string from which you grab some

text fragment or it references the cell holding the text string. The start_num

argument tells Excel where the text fragment starts that you want to grab.

The num_char argument tells Excel how long the text fragment is. For example,

to grab the text fragment tac from the text string tic tac toe, use the

following formula:

=MID("tic tac toe",5,3)

The function returns tac.


The PROPER function

The PROPER function capitalizes the first letter in every word in a text string.

The function uses the following syntax:

PROPER(text)

The text argument either supplies the text string or references the cell

holding the text string. For example, to capitalize the initial letters in the text

string ambassador kennedy, use the following formula:

PROPER("ambassador kennedy")

The function returns the text string Ambassador Kennedy.

The REPLACE function

The REPLACE function replaces a portion of a text string. The function uses

the following syntax:

REPLACE(old_text,start_num,num_chars,new_text)

The old_text argument, which is case‐sensitive, either supplies the text

string from which you grab some text fragment or it references the cell holding

the text string. The start_num argument, which is the starting position,

tells Excel where the text starts that you want to replace. The num_chars

argument tells Excel the length of the text fragment (how many characters)

that you want to replace. The new_text argument, also case‐sensitive, tells

Excel what new text you want to use to replace the old text. For example, to

replace the name Chamberlain with the name Churchill in the text string

Mr. Chamberlain, use the following formula:

REPLACE("Mr. Chamberlain",5,11,"Churchill")

The function returns the text string Mr. Churchill.

The REPT function

The REPT function repeats a text string. The function uses the following

syntax:

REPT(text,number_times)

Chapter 3: Scrub‐a‐Dub‐Dub: Cleaning Data 73

The text argument either supplies the text string or references the cell holding

the text string. The number_times argument tells Excel how many times

you want to repeat the text. For example, the following formula:

REPT("Walla",2")

returns the text string WallaWalla.

The RIGHT function

The RIGHT function returns a specified number of characters from the right

end of a text string. The function uses the following syntax:

RIGHT(text,num_chars)

The text argument either supplies the text string that you want to manipulate

or references the cell holding the text string. The num_chars argument

tells Excel how many characters to grab.

For example, to grab the rightmost two characters from the text string

Redmond WA, use the following formula:

RIGHT(“Redmond WA”,2)

The function returns the text WA.

The SEARCH function

The SEARCH function calculates the starting position of a text fragment

within a text string. The function uses the following syntax:

SEARCH(find_text,within_text,start_num)

The find_text argument tells Excel what text fragment you’re looking for.

The within_text argument tells Excel what text string that you want to

search. The start_num argument tells Excel where to start its search. The

start_num argument is optional. If you leave it blank, Excel starts the search

at the beginning of the within_text string.

For example, to identify the position at which the text fragment Churchill

starts in the text string Mr. Churchill, use the following formula:

SEARCH("Churchill","Mr. Churchill",1)

The function returns the value 5.


The SUBSTITUTE function

The SUBSTITUTE function replaces occurrences of text in a text string. The

function uses the following syntax:

SUBSTITUTE(text,old_text,new_text,instances)

The text argument tells Excel what text string you want to edit by replacing

some text fragment. The old_text argument identifies the to‐be‐replaced

text fragment. The new_text supplies the new replacement text.

As an example of how the SUBSTITUTE function works, suppose that you

need to replace the word Senator with the word President in the text

string Senator Obama.

SUBSTITUTE("Senator Obama","Senator","President")

The function returns the text string President Obama.

The instances argument is optional, but you can use it to tell Excel for

which instance of old_text you want to make the substitution. For example,

the function

SUBSTITUTE("Senator Senator","Senator","President",1)

returns the text string President Senator.

The function

SUBSTITUTE("Senator Senator Obama","Senator","President",2)

returns the text string Senator President Obama.

If you leave the instances argument blank, Excel replaces each occurrence of

the old_text with the new_text. For example, the function

SUBSTITUTE("Senator Senator Obama","Senator","President")

returns the text string President President Obama.

The T function

The T function returns its argument if the argument is text. If the argument

isn’t text, the function returns nothing. The function uses the following syntax:

T(value)


For example, the formula T(123) returns nothing because 123 is a value.

The formula T(“Seattle”) returns Seattle because Seattle is a text

string.

The TEXT function

The TEXT function formats a value and then returns the value as text. The

function uses the following syntax:

TEXT(value,format_text)

The value argument is the value that you want formatted and returned as

text. The format_text argument is a text string that shows the currency

symbol and placement, commas, and decimal places that you want. For

example, the formula

=TEXT(1234.5678,"$##,###.00")

returns the text $1,234.57.

Note that the function rounds the value.

The TRIM function

The TRIM function removes extra spaces from the right end of a text string.

The function uses the following syntax:

TRIM(text)

The text argument is the text string or, more likely, a reference to the cell

holding the text string.

The UPPER function

The UPPER function returns an all‐uppercase version of a text string. The

function uses the following syntax:

UPPER(text)

The text argument either supplies the text string that you want to convert

or it references the cell holding the text string. For example, to convert

76 Part I: Getting Started with Data Analysis

the text string professional to PROFESSIONAL, you can use the following

formula:

UPPER(“professional”)

The function returns the text string PROFESSIONAL.

The VALUE function

The VALUE function converts a text string that looks like a value to a value.

The function uses the following syntax:

VALUE(text)

The text argument either supplies the text string that you want to convert

or it references the cell holding the text string. For example, to convert the

text string $123,456.78 — assume that this isn’t a value but a text string —

you can use the following formula:

VALUE(“$123,456.78”)

The function returns the value 123456.78.

Comments

Popular posts from this blog

Charts - Make your data presentable

One-click charts are easy: Select the data and press  Alt+F1 . What if you would rather create bar charts instead of the default clustered column chart? To make your life easier, you can change the default chart type. Store your favorite chart settings in a template and then teach Excel to produce your favorite chart in response to  Alt+F1 . Say that you want to clean up the chart above. All of those zeros on the left axis take up a lot of space without adding value. Double-click those numbers and change Display Units from None to Millions. To move the legend to the top, click the + sign next to the chart, choose the arrow to the right of Legend, and choose Top. Change the color scheme to something that works with your company colors. Right-click the chart and choose Save As Template. Then, give the template a name. (I called mine ClusteredColumn.) Select a chart. In the Design tab of the Ribbon, choose Change Chart Type. Click on the Templates folder to see the template that ...

Data Analysis Tool Pack

  The  Analysis ToolPak  is an  Excel add-in  program that provides data analysis tools for financial, statistical and engineering data analysis. To load the Analysis ToolPak add-in, execute the following steps. 1. On the File tab, click Options. 2. Under Add-ins, select Analysis ToolPak and click on the Go button. 3. Check Analysis ToolPak and click on OK. 4. On the Data tab, in the Analysis group, you can now click on  Data Analysis . The following dialog box below appears. 5. For example, select Histogram and click OK to create a Histogram in Excel. Example Rank and Percentile The Rank and Percentile contained within the Analysis-ToolPak can be quickly used to find the rank of all the values in a list. The advantage of using the Rank and Percentile feature is that the percentile is also added to the output table. The percentile is a percentage that indicates the proportion of the list which is below a given number. Highlight the list (or the cells) which...

20 Power BI Dax Measures

Power bi 20 dax measures 20 DAX (Data Analysis Expressions) measures in Power BI with examples: Total Sales: scss Copy code Total Sales = SUM (Sales[Amount]) Average Sales Price: scss Copy code Avg Sales Price = AVERAGE (Sales[Amount]) Total Units Sold: mathematica Copy code Total Units Sold = SUM ( Sales [ Quantity ] ) Total Customers: scss Copy code Total Customers = COUNTROWS (Customer) Total Products: mathematica Copy code Total Products = COUNTROWS ( Product ) Maximum Sales Amount: scss Copy code Max Sales Amount = MAX (Sales[Amount]) Minimum Sales Amount: scss Copy code Min Sales Amount = MIN (Sales[Amount]) Sales Growth Percentage: mathematica Copy code Sales Growth % = ( Total Sales - [ Total Sales Last Year ] ) / [ Total Sales Last Year ] Total Profit: scss Copy code Total Profit = SUM (Sales[Profit]) Total Orders: scss Copy code Total Orders = COUNTROWS (Orders) Total Customers with Sales: css Copy code Total Customers with Sales = COUNTROWS( FILTER ...