Python application part 1

Data Wrangling with Pandas

With lists you can remove elements, find an element's index, check whether an element is in a list, and so on. But for now, we want to focus more on Pandas!

There is also a json module in Python for reading complex JSON files.

Popular functions with re

Here, the match object contains information about the matched string: 1) its span (start and end position in the text), and 2) the matched string itself. To extract further details from the Match object, use .group(), .span(), .start(), and .end().

Suppose you need the second word with respect to a delimiter, in this case #.

In the case below, when we look for b*, we are asking for zero or more 'b' between 'a' and 'c'. So it will find ac, abc, abbbc, and dabc, but will NOT find abdc, since the 'd' breaks the pattern.

In the example below, '^5' means we are checking whether the string starts with '5'.

In the example below, 'hp$' means we are checking whether the string ends with 'hp'.

Inside square brackets, ^ basically works as a NOT operator.

If we give {m,n} numbers, for example {1,4}, it matches 1, 2, 3, or 4 occurrences of the previous character.

In the example below:
\w -> any alphanumeric character, equivalent to [a-zA-Z0-9_]
+ -> matches one or more occurrences of the previous character

You should notice that in the first email ID, and a few others as well, the email IDs also got changed. So let's improve it: we are now asking for @ followed by alphanumeric characters.

In the datetime64[ns] dtype, the [ns] stands for nanosecond and specifies the precision of the datetime object.
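The regular-expression points above can be tried out with a short sketch like the one below. All sample strings and variable names here are made up for illustration and are not taken from the original notebook.

```python
import re

# re.search returns a Match object, or None when nothing matches.
text = "The engine makes 500hp"
m = re.search(r"\d+", text)
match_info = (m.group(), m.span(), m.start(), m.end())

# Second word with respect to a delimiter, here '#'.
second_word = re.split("#", "alpha#beta#gamma")[1]

# b* means zero or more 'b' between 'a' and 'c'.
words = ["ac", "abc", "abbbc", "dabc", "abdc"]
found = [w for w in words if re.search(r"ab*c", w)]

# ^ anchors the start of the string, $ anchors the end.
starts_with_5 = bool(re.search(r"^5", "500hp"))
ends_with_hp = bool(re.search(r"hp$", "500hp"))

# Inside square brackets, ^ negates the character set.
non_digits = re.findall(r"[^0-9]", "500hp")

# {1,4} matches between one and four repetitions (greedy).
runs = re.findall(r"b{1,4}", "abbbbbbc")

# '@' followed by one or more alphanumeric characters.
emails = "john.doe@example.com, jane@test.org"
domains = re.findall(r"@\w+", emails)
```

With these inputs, found holds every word except "abdc", and domains holds '@example' and '@test', because \w does not match the dot.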
Units: D - day, s - second, ms - millisecond, us - microsecond, ns - nanosecond.

Please note: infer_datetime_format=True takes the first non-null value as a reference to best-guess the format of the rest of the values. In pandas 2.0 and later this argument is deprecated, because that inference is already the default behaviour.

Let's learn a few more basics that are used quite often during automation.

Suppose you want to move a date by n quarters. We haven't found a user-friendly function for that yet, so we developed one ourselves.

(inplace = True): makes the changes in the original DataFrame.
(inplace = False): the query() method leaves the original DataFrame untouched and returns a new one. This is the default.

.map(lambda x: x - (x*0.91)): this applies a function to each value in the "purchase_value" column. The function subtracts 91% of the value from the value itself, which gives the fee charged on each purchase (9% of the value). The result is a pandas Series containing the fee values.

One of the most important things in merge/join is how it behaves in a many-to-many condition. Let's see.

The join() function performs a left join by default, which means it keeps all the rows of the left DataFrame and fills the columns coming from the right DataFrame with NaN wherever the right DataFrame has no matching row.

Of course, we must have missed a lot of things, intentionally or unintentionally. Please let us know if you want us to cover something, and we will add it here.
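The pandas points above can be illustrated with a minimal sketch. The sample data, the column names other than "purchase_value", and the helper shift_quarters are hypothetical: the notebook's own quarter-shifting function is not shown in this text, so pd.DateOffset is used here as one possible approach.

```python
import pandas as pd

# Epoch seconds converted with an explicit unit ("s" = second).
stamps = pd.to_datetime(pd.Series([0, 86400]), unit="s")

def shift_quarters(dates, n):
    """Hypothetical helper: move each date by n quarters (3 * n months)."""
    return dates + pd.DateOffset(months=3 * n)

shifted = shift_quarters(pd.Series(pd.to_datetime(["2023-01-15"])), 2)

df = pd.DataFrame({"purchase_value": [100.0, 200.0, 50.0]})

# inplace=False (the default): df stays untouched, a new frame is returned.
filtered = df.query("purchase_value > 60")

# inplace=True: the frame itself is modified and nothing is returned.
df_inplace = df.copy()
df_inplace.query("purchase_value > 60", inplace=True)

# Fee per purchase: the value minus 91% of the value, i.e. 9% of it.
fees = df["purchase_value"].map(lambda x: x - (x * 0.91))

# Many-to-many merge: every matching left row pairs with every matching right row.
left = pd.DataFrame({"key": ["a", "a", "b"], "l": [1, 2, 3]})
right = pd.DataFrame({"key": ["a", "a", "c"], "r": [10, 20, 30]})
merged = left.merge(right, on="key")  # inner join: 2 x 2 = 4 rows for key "a"

# join() is a left join on the index by default; unmatched rows get NaN.
a = pd.DataFrame({"x": [1, 2]}, index=["k1", "k2"])
b = pd.DataFrame({"y": [10]}, index=["k1"])
joined = a.join(b)
```

The many-to-many case is the one to watch: two rows with key "a" on each side produce four rows in the result, so duplicated keys can multiply the row count without any warning.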
