Row Operations

This section gives a detailed overview of the row operations available in Optimus. They are accessed through df.rows.

The examples below work on the following sample dataframe, df.

words              num  animals  thing  second  filter
I like fish        1    dog dog  housé  5       a
zombies            2    cat      tv     6       b
simpsons cat lady  2    frog     table  7       1
null               3    eagle    glass  8       c

rows.append(row)

Append a row to the end of the dataframe.

df.rows.append(["this is a word", 2, "this is an animal", "this is a thing", 64, "this is a filter"]).table()
words              num  animals            thing            second  filter
I like fish        1    dog dog            housé            5       a
zombies            2    cat                tv               6       b
simpsons cat lady  2    frog               table            7       1
null               3    eagle              glass            8       c
this is a word     2    this is an animal  this is a thing  64      this is a filter

rows.sort()

Sort the rows by one column or by several (column, order) conditions.

df.rows.sort("animals").table()
words              num  animals  thing  second  filter
simpsons cat lady  2    frog     table  7       1
null               3    eagle    glass  8       c
I like fish        1    dog dog  housé  5       a
zombies            2    cat      tv     6       b

df.rows.sort("animals", "desc").table()
words              num  animals  thing  second  filter
simpsons cat lady  2    frog     table  7       1
null               3    eagle    glass  8       c
I like fish        1    dog dog  housé  5       a
zombies            2    cat      tv     6       b

df.rows.sort([("animals", "desc"), ("thing", "asc")]).table()
words              num  animals  thing  second  filter
simpsons cat lady  2    frog     table  7       1
null               3    eagle    glass  8       c
I like fish        1    dog dog  housé  5       a
zombies            2    cat      tv     6       b
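
To make the multi-condition form concrete, here is a minimal plain-Python sketch of sorting by a list of (column, order) pairs. It does not use Optimus or Spark and is not how Optimus implements the sort. The sort_rows helper and the trimmed-down sample rows are hypothetical and only illustrate the ordering semantics.

```python
# Plain-Python illustration of multi-condition sorting; not Optimus code.
rows = [
    {"animals": "dog dog", "thing": "housé"},
    {"animals": "cat", "thing": "tv"},
    {"animals": "frog", "thing": "table"},
    {"animals": "eagle", "thing": "glass"},
]


def sort_rows(rows, conditions):
    """Sort a list of dicts by (column, "asc" | "desc") pairs."""
    result = list(rows)
    # Python's sort is stable, so applying the conditions from last to
    # first makes the first condition the primary sort key.
    for col, order in reversed(conditions):
        result.sort(key=lambda r: r[col], reverse=(order == "desc"))
    return result


sorted_rows = sort_rows(rows, [("animals", "desc"), ("thing", "asc")])
print([r["animals"] for r in sorted_rows])
# ['frog', 'eagle', 'dog dog', 'cat']
```

The later conditions only matter when earlier ones tie. In the sample data every animals value is distinct, so the "thing" condition never changes the order.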

rows.select(*args, **kwargs)

Alias of the Spark filter function. Returns the rows that match an expression.

df.rows.select(df["num"]==1).table()
words              num  animals  thing  second  filter
I like fish        1    dog dog  housé  5       a

rows.select_by_dtypes(col_name, data_type=None)

This function selects rows according to the data type that Python detects for the value in the given column.

words              num  animals  thing  second  filter
simpsons cat lady  2    frog     table  7       1
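
The output above keeps only the row whose filter value is 1, the one value in that column that reads as an integer. The following plain-Python sketch shows the idea of selecting by detected type. The looks_like_int helper is hypothetical, and the detection rule it uses is an assumption rather than Optimus's actual implementation.

```python
# Plain-Python illustration of selecting values by detected type; not Optimus code.
def looks_like_int(value):
    """Return True if a cell value can be read as an integer (assumed rule)."""
    try:
        int(str(value))
        return True
    except ValueError:
        return False


# Values of the "filter" column in the sample dataframe.
filter_column = ["a", "b", "1", "c"]

selected = [v for v in filter_column if looks_like_int(v)]
print(selected)
# ['1']
```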

rows.drop(where=None)

Drop the rows that match a dataframe expression.

df.rows.drop((df["num"]==2) | (df["second"]==5)).table()
words              num  animals  thing  second  filter
null               3    eagle    glass  8       c
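
As a sanity check on the expression above, this plain-Python sketch applies the same condition to the sample rows, reduced to the columns involved. It is only an illustration of the boolean logic and does not use Optimus or Spark. The should_drop helper is hypothetical.

```python
# Plain-Python illustration of the drop condition; not Optimus code.
rows = [
    {"words": "I like fish", "num": 1, "second": 5},
    {"words": "zombies", "num": 2, "second": 6},
    {"words": "simpsons cat lady", "num": 2, "second": 7},
    {"words": None, "num": 3, "second": 8},
]


def should_drop(row):
    # Mirrors (df["num"]==2) | (df["second"]==5)
    return row["num"] == 2 or row["second"] == 5


kept = [row for row in rows if not should_drop(row)]
print(kept)
# [{'words': None, 'num': 3, 'second': 8}]
```

In the Spark expression, each comparison is wrapped in parentheses because | binds more tightly than == in Python. Column conditions are combined with |, & and ~ rather than or, and and not.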

rows.drop_by_dtypes(col_name, data_type=None)

Drop the rows whose value in the given column has the given data type.

df.rows.drop_by_dtypes("filter", "int").table()
words              num  animals  thing  second  filter
I like fish        1    dog dog  housé  5       a
zombies            2    cat      tv     6       b
null               3    eagle    glass  8       c

Drop using an abstract UDF

from optimus.functions import abstract_udf as audf

def func_data_type(value, attr):
    # Return True for the rows to drop: here, every row whose num is greater than 1
    return value > 1


df.rows.drop(audf("num", func_data_type, "boolean")).table()
words              num  animals  thing  second  filter
I like fish        1    dog dog  housé  5       a
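
The effect of the predicate can be checked without Spark. This minimal sketch applies the same function to the values of the num column in plain Python. Passing None as attr is an assumption made for illustration, since the function ignores that argument.

```python
# Plain-Python illustration of a drop predicate; not Optimus code.
def func_data_type(value, attr):
    return value > 1


# Values of the "num" column in the sample dataframe.
nums = [1, 2, 2, 3]

# Keep only the values for which the predicate is False.
kept_nums = [n for n in nums if not func_data_type(n, None)]
print(kept_nums)
# [1]
```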