Okay.
So let me go over my solutions for the assignment.
And please know that these are my solutions.
And that doesn’t mean that it’s the only way to address the requirement.
There are many valid approaches, and as long as you got to the correct outcome, then that’s fine.
Just as long as your method was efficient and reasonable.
Okay, So the first requirement is detailed here.
It was to create an order underscore details table.
So this requires a few things, which I’ll explain.
So first of all, I’ve imported all of the files from the silver layer into the appropriate variables,
and I’ve imported the functions as well.
Okay, so let me run this cell.
So first off, I’ll display the order’s table.
So this contains the order ID the order timestamp which we can convert to a date, the customer ID and
the store name.
So it has a lot of the columns we need for the order underscore details.
Table So let’s start with this table as our base.
So let’s check the data types.
So as you can see, the data type of the order underscore timestamp is a timestamp.
So we can easily use the to underscore date function to convert this to a date.
And I’ve done that in the cell below using the select method.
So here you can see the to date function on the order underscore timestamp column and I’ve alias that
as date.
I noticed that I’ve assigned the results to the variable order, underscore details.
So let me run the cell.
And now let’s display all the underscore details.
So as you can see, we’ve now got a date instead of a time stamp.
And by the way, if you’re unfamiliar with the to underscore date function, please refer to this documentation
here so you can copy this and paste it into your URL.
So we can see that our dataframe is taking shape.
We still need to bring in the total sales value.
So let’s explore the order.
Underscore items DataFrame.
Great.
So from the order underscore items dataframe, we can do the unit price multiplied by the quantity.
That would bring the total sales at a road level.
So let me run this URL below.
Where I’m joining the order underscore items to the order underscore details table via the order ID
of both data frames.
The order details.
Table is on the left side and I’m doing a left join because I want to retain all records in the order
detail table.
And then finally.
I have a select statement here.
Where I am selecting only the required columns from both tables.
And that’s because we don’t need the order ID column from both tables.
So it’s better for ambiguity to just select it from one of the tables.
So let me run this cell and then notice that I’ve assigned it back to the variable order details which
I’ll display now.
So now, as you can see, we’ve got the unit price and the quantity bought in.
So as you can see, it’s really starting to take shape.
So we can do the unit price multiplied by quantity at a row level to calculate the total amount for
each record.
And that’s what I’ve done below here.
As you can see, I’ve used with column and then I’ve done unit price multiplied by quantity, and then
the column name is total sales amount.
So I’ll run that and I’ll display the results.
So now we have the total sales amount at a row level.
So now the final requirement is to group the data frame.
And we should group it by order ID, date customer ID and store name.
And then to get the total amount for each order.
We can simply aggregate it by summing the total sales amount.
So that’s what I’ve done here.
And I should actually change this to total underscore sales.
Underscore that.
And then I’ve done with Colin, renamed where I am renaming the total underscore sales underscore amount
to total underscore order underscore amount.
So let’s run that now and then let’s view our results.
So as you can see, we have the total order amount at a row level.
And the root and the order ID should be unique for each record.
Finally, let’s round this to two decimal places.
Which is what I’ve done here using the round function.
And now I can just quickly display the results.
And now that looks perfect.
So let’s write it to the gold layer.
So note at present we don’t have a gold folder, but I’ll create one using this path.
So I’ve got gold and then order details.
And then the mode should be overwrite in case you want to run the cell multiple times.
So let’s run this.
And that’s done.
So now if I refresh this, we should have a gold folder.
And here is all the details.
Right.
So that was the first requirement.
Another task was to create an aggregated table to show the monthly sales total so we can leverage the
order, underscore detail dataframe that we created here.
Let me just display it.
Okay.
So we can simply extract the month and the year using the date underscore format function.
And again, you can just refer to this link here if you want more information on that function.
And it’s good to extract the year as well because we have data spanning multiple years.
So you don’t want the month to contain information for multiple years.
And we can use the data patterns to specify the format as well.
So as you can see, that’s what I’ve done here.
So I’ve used with column and then date format and I’ve specified the, the format that I’d like to present
the column called month underscore year.
So let me run that and then display the result which I’ve saved into the variable sales underscore with
underscore month.
So let’s display that.
And now, as you can see, we’ve got the month year for each date.
So now all you need to do is group by this month year column and then aggregate the total underscore
order underscore amount.
So that’s what I’ve done here.
And I’ve saved it into a dataframe, into a variable called monthly underscore sales.
So I’ve grouped by the month and fiscal year sum of the total underscore order, underscore amount.
And then I’ve done a with column to.
Create a total sales where I’ve done the round function on the total underscore order underscore amount
to two decimal places.
Finally, I’ve used the salt method to sort it by the month, underscore year descending, and then
I’ve selected the two columns that I’d like.
So let me run this and then display the results.
And as you can see, that meets our requirements.
And then finally, I will write that to a file called Monthly Sales in the gold folder.
So let’s run this.
And now if I refresh this, as you can see.
We’ve got a monthly sales here.
The final task was to create a monthly sales table, but also to split it by the store name.
This is basically the same as the previous ask except to group by the store name as well.
So we can still leverage the sales underscore with underscore month dataframe.
Here, and then we can simply do the same operation as before, except for group by month, year and
store name.
So if I run this and then display the results.
Here’s our requirement.
So we’ve got the month here and then the store name as well with the total sales aggregated.
And then we can write that as a POC file called store underscore monthly underscore sales.
Great.
So now if I refresh that.
He is on file.
Great.
So that was a solution to the assignments.
I hope you found it useful.
And like I said, I’ll be sharing this as a resource with the Course Resources folder so you can review
this in your own time to get additional detail.
And as you can see, I’ve got commentary for each line of code along with additional resources.