Row Operations
This page gives a detailed overview of all the row operations available in Optimus.
You can access the operations via df.rows.
Let’s create a sample dataframe to start working.
words             | num | animals | thing | second | filter |
I like fish       | 1   | dog dog | housé | 5      | a      |
zombies           | 2   | cat     | tv    | 6      | b      |
simpsons cat lady | 2   | frog    | table | 7      | 1      |
null              | 3   | eagle   | glass | 8      | c      |
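The sample data above can be written down as plain Python values. The sketch below is only an illustration: the names cols and rows, the column types, and the choice to store the filter column as strings are assumptions, and the Optimus call mentioned in the comment is not executed here.

```python
# Sample data as plain Python values (illustrative; types are assumed).
cols = [
    ("words", "str"),
    ("num", "int"),
    ("animals", "str"),
    ("thing", "str"),
    ("second", "int"),
    ("filter", "str"),
]

rows = [
    ("I like fish", 1, "dog dog", "housé", 5, "a"),
    ("zombies", 2, "cat", "tv", 6, "b"),
    ("simpsons cat lady", 2, "frog", "table", 7, "1"),
    (None, 3, "eagle", "glass", 8, "c"),
]

# Assumption: in Optimus 2.x, data shaped like this is typically handed to
# op.create.df(...) to build the dataframe called df on this page.
# That call needs a running Spark session, so it is left out of this sketch.

# Sanity check: every row has one value per column.
widths_ok = all(len(row) == len(cols) for row in rows)
print(widths_ok)
```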
rows.append(row)
Append a row at the end of a dataframe. The row is given as a list with one value per column.
df.rows.append(["this is a word",2, "this is an animal", "this is a thing", 64, "this is a filter"]).table()
words             | num | animals           | thing           | second | filter           |
I like fish       | 1   | dog dog           | housé           | 5      | a                |
zombies           | 2   | cat               | tv              | 6      | b                |
simpsons cat lady | 2   | frog              | table           | 7      | 1                |
null              | 3   | eagle             | glass           | 8      | c                |
this is a word    | 2   | this is an animal | this is a thing | 64     | this is a filter |
rows.sort()
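Sort the rows by one or more columns. You can pass a column name, a column name with an order ("asc" or "desc"), or a list of (column, order) tuples.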
df.rows.sort("animals").table()
words             | num | animals | thing | second | filter |
simpsons cat lady | 2   | frog    | table | 7      | 1      |
null              | 3   | eagle   | glass | 8      | c      |
I like fish       | 1   | dog dog | housé | 5      | a      |
zombies           | 2   | cat     | tv    | 6      | b      |
df.rows.sort("animals", "desc").table()
words             | num | animals | thing | second | filter |
simpsons cat lady | 2   | frog    | table | 7      | 1      |
null              | 3   | eagle   | glass | 8      | c      |
I like fish       | 1   | dog dog | housé | 5      | a      |
zombies           | 2   | cat     | tv    | 6      | b      |
df.rows.sort([("animals","desc"),("thing","asc")]).table()
words             | num | animals | thing | second | filter |
simpsons cat lady | 2   | frog    | table | 7      | 1      |
null              | 3   | eagle   | glass | 8      | c      |
I like fish       | 1   | dog dog | housé | 5      | a      |
zombies           | 2   | cat     | tv    | 6      | b      |
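To make the ordering of the list-of-tuples form concrete, here is a minimal plain-Python sketch that sorts the sample rows by animals descending and then by thing ascending. It does not use Spark and is not how Optimus implements rows.sort; the helper name sort_rows and the dict-based rows are assumptions made for illustration.

```python
# Plain-Python illustration of multi-column sorting; not Optimus code.
rows = [
    {"words": "I like fish", "num": 1, "animals": "dog dog", "thing": "housé"},
    {"words": "zombies", "num": 2, "animals": "cat", "thing": "tv"},
    {"words": "simpsons cat lady", "num": 2, "animals": "frog", "thing": "table"},
    {"words": None, "num": 3, "animals": "eagle", "thing": "glass"},
]


def sort_rows(rows, spec):
    """Sort rows by a list of (column, "asc" | "desc") pairs (hypothetical helper)."""
    result = list(rows)
    # Python's sort is stable, so applying the keys from last to first
    # gives the first (column, order) pair the highest priority.
    for column, order in reversed(spec):
        result.sort(key=lambda r: r[column], reverse=(order == "desc"))
    return result


ordered = sort_rows(rows, [("animals", "desc"), ("thing", "asc")])
print([r["animals"] for r in ordered])
```

Because every animals value in the sample is distinct, the second condition never has to break a tie, which is why all three calls above return the same table.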
rows.select(*args, **kwargs)
Alias of the Spark filter function. Returns the rows that match an expression.
df.rows.select(df["num"]==1).table()
words       | num | animals | thing | second | filter |
I like fish | 1   | dog dog | housé | 5      | a      |
rows.select_by_dtypes(col_name, data_type=None)
Select rows according to the data type that Python detects for the value in the given column. The table below shows the only row of the sample dataframe whose filter value is detected as an integer.
words             | num | animals | thing | second | filter |
simpsons cat lady | 2   | frog    | table | 7      | 1      |
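The idea of selecting rows by detected type can be sketched in plain Python. This is a simplified illustration, not the type parser Optimus uses: the helpers detect_type and select_by_type, the type names they return, and the assumption that the filter column holds strings are all hypothetical.

```python
# Plain-Python illustration of selecting rows by detected type; not Optimus code.
rows = [
    {"words": "I like fish", "filter": "a"},
    {"words": "zombies", "filter": "b"},
    {"words": "simpsons cat lady", "filter": "1"},
    {"words": None, "filter": "c"},
]


def detect_type(value):
    """Guess a type name for a cell value (hypothetical, simplified rules)."""
    if value is None:
        return "null"
    text = str(value)
    try:
        int(text)
        return "integer"
    except ValueError:
        pass
    try:
        float(text)
        return "float"
    except ValueError:
        return "string"


def select_by_type(rows, col_name, data_type):
    """Keep only the rows whose value in col_name is detected as data_type."""
    return [r for r in rows if detect_type(r[col_name]) == data_type]


selected = select_by_type(rows, "filter", "integer")
print([r["words"] for r in selected])
```

Inverting the comparison in select_by_type would give the complementary behaviour, keeping every row except those detected as the given type.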
rows.drop(where=None)
Drop the rows that match a dataframe expression.
df.rows.drop((df["num"]==2) | (df["second"]==5)).table()
words | num | animals | thing | second | filter |
null  | 3   | eagle   | glass | 8      | c      |
rows.drop_by_dtypes(col_name, data_type=None)
Drop the rows whose value in the given column is detected as the given data type.
df.rows.drop_by_dtypes("filter", "int").table()
words       | num | animals | thing | second | filter |
I like fish | 1   | dog dog | housé | 5      | a      |
zombies     | 2   | cat     | tv    | 6      | b      |
null        | 3   | eagle   | glass | 8      | c      |
Drop using an abstract UDF
from optimus.functions import abstract_udf as audf
def func_data_type(value, attr):
    # Return True for the rows to drop (here: num greater than 1)
    return value > 1
df.rows.drop(audf("num", func_data_type, "boolean")).table()
words       | num | animals | thing | second | filter |
I like fish | 1   | dog dog | housé | 5      | a      |
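The effect of dropping with a per-value function can be mimicked in plain Python: the function is called once per value of the chosen column, and every row for which it returns True is removed. The sketch below only illustrates that behaviour; drop_with_udf is a hypothetical helper, not part of Optimus, and no Spark UDF is involved.

```python
# Plain-Python illustration of dropping rows with a per-value function; not Optimus code.
rows = [
    {"words": "I like fish", "num": 1},
    {"words": "zombies", "num": 2},
    {"words": "simpsons cat lady", "num": 2},
    {"words": None, "num": 3},
]


def func_data_type(value, attr):
    # Same predicate as in the example above: True means "drop this row".
    return value > 1


def drop_with_udf(rows, col_name, func, attr=None):
    """Drop the rows for which func(value, attr) is True (hypothetical helper)."""
    return [r for r in rows if not func(r[col_name], attr)]


kept = drop_with_udf(rows, "num", func_data_type)
print([r["words"] for r in kept])
```

The attr argument is unused by this predicate; it is kept in the signature only to mirror the two-argument function shown in the example above.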