This thread is ongoing, but let me start with the results I have from a year worth of dropped firewall connections.
Let’s jump back a minute. I am importing firewall logs for dropped connections into a MS SQL Database. September as you can see is a fun month with 2,096,766 records.
Since my firewall is a Zyxel device, I gave a look at the .csv delimited log output. Easily enough you can use a Data Import Wizard to spin the logs into some tables. Rough table to log structure is as such:
CREATE TABLE zy_2016-09 ( time VARCHAR(50) NULL, source VARCHAR(50) NULL, destination VARCHAR(50) NULL, priority VARCHAR(50) NULL, category VARCHAR(50) NULL, note VARCHAR(50) NULL, sour_interface VARCHAR(50) NULL, dest_interface VARCHAR(50) NULL, protocol VARCHAR(50) NULL, message VARCHAR(250) NULL, col00 VARCHAR(250) NULL,
I am having fun crawling some output. Typically it’s some sort of fancy OpSec to not say your type of network gear, but this is meant to be informative and hopefully helpful.
So let’s crawl some queries and output in the next post.
I am in the middle of crunching data at the moment. Since we have some records, I am starting with the gigantic table for obvious trends in the dropped connections.
I didn’t have good criteria yet, so I went with ordering the output by Source, in Descending order. Giving the old scroll observation, I saw quite a few connections on port :7759.
2016-09-19 16:21:36 99.98.xx.xx:7759 xxx.xxx.xxx.xxx:7759 notice firewall ACCESS BLOCK wan1 udp Match default rule DROP
I figured I’ll filter the IP result for the moment, with the xxx.xxx address being to my web IP at the time. Yay it was dropped. So that’s a nice log to see.
Jumping back, I started with normal phishing ports but they were not yet as interesting as the following query.
SELECT * FROM [dbo].[zy2016-09] WHERE [source] like '%:7759' ORDER BY [source] desc
975340 rows returned. Out of 2096766 rows. That was a big old flag for just looking into the 99.xx range by a output sort.
I’m going back to reviewing results and trending them. I did skip some details for procuring the logs from your device, setting the firewall up to log to a USB device, and so on. Manually reviewing logs sucks and being able to do queries with COUNT(*) for total, is nice to know what you might be dealing with. Always remember the internet is being port scanned, so trying to log some of it hitting you, let’s you see the weather online.
Jolly Weekend on the upcoming Halloween season. :yar:
Let’s get some Python involved. Manually writing queries is going to be a pain and take a ton of time.
#Dump Queries with start to end port range import sys pstartNum = int(input("Source Port number start: ")) q1 = "SELECT COUNT(*) AS Port_" q2 = "FROM [dbo].[zy2016-09]" qWat = "WHERE [Column 1] like '%:" qClose = "'" for x in range(100): sys.stdout.write("%s%s %s \n" % (q1, pstartNum, q2)); sys.stdout.write("%s%s%s \n" % (qWat, pstartNum, qClose)); print("GO") print() pstartNum = pstartNum - 1 exit
While I’m not a python wizard, this code works. What it does is start from a user-inputted port number and runs a loop 100 times, subtracting 1 for each iteration.
Taking a description of why I am using sys.stdout.write, is to control the spacing on string output, so I have functional SQL queries. If you try to do this with the print() operator, you will get spaces in your output that will break your SQL Query.
I was trying to find a way to handle this in print, but I was fighting against how string interpolation and output control in Python works. Ending this part with what a pair from the 100 loop looks like
SELECT COUNT(*) AS Port_8888 FROM [dbo].[zy2016-09] WHERE [Column 1] like '%:8888' GO SELECT COUNT(*) AS Port_8887 FROM [dbo].[zy2016-09] WHERE [Column 1] like '%:8887' GO
I defined the q variables as parts of the SQL query, instead of kludging together a nasty looking sys.stdout.write line, while keeping it more easy to edit and maintain.
By adding ‘GO’ to the 3rd line, I can have the other queries run instead of not getting results until all 100 queries complete. For some samples of SQL code, check out this MS SQL usage thread I have up here.
Debugging the SQL, here is the breakdown of the output above.
SELECT COUNT(*) AS Port_8887 FROM [dbo].[zy2016-09]
WHERE [Column 1] like ‘%:8887’
SELECT COUNT(*) AS Port_8887
This does a count for the matched records and displays it to your results. If you want to see the record results, replace COUNT(*) with *
Here is the name of the table you imported from your firewall log. In my case, it’s the name I gave to the exported firewall logs.
WHERE [Column 1] like ‘%:8887’
This is the search criteria for the entire table. In this case, :8887 represents the port number for [Columm 1]. Or as I defined in the 1st post source ip address.
Ending with this line, says run these individually, instead of waiting for all 100 of your queries to finish.
Running this in Microsoft SQL Server Management Studio, from a New Query window, you will get similar output to below image.
I am showing 2 example outputs, but the python script is doing 100 iterations.
Jumping back to source Log files.
Depending on your device, check the top of the file. You will likely have some column headers, and in my case I had a line of ‘======’ characters. I find it easier to remove the header and 2nd line, then save the modified file as a new file.
Example log file header for Zyxel firewall:
Time ,Source ,Destination ,Priority ,Category ,Note ,Source Interface ,Destination Interface ,Protocol ,Message
To save hassle on the import, I saved the modified files as zy(Year)-(Month) (IE: zy2016-09) as seen in the SQL code examples. These became my table names when I imported them into my database. By deleting the top two lines, the new file works as a clean csv to import.
What is the python doing?
I read a few books for python and read the online manual, but trying to format string output was driving me crazy. To save you some rage, my observations are that using print(“text”,variable,”rest of query line 1″) will always force a space to be inserted. This will not fly in SQL queries and you don’t want that hassle.
Trying to avoid vomit inducing code and searching around, I found that using stdout.write, lets you define the output spacing, followed by your variable values. It took me vastly longer than I would have liked to have figured this out, but considering generating database scripts was the main goal I had for learning python, I’m pretty excited this works. :)
The running python code is higher in the thread, but below I explain each line, with a # comment under each line.
Line for line breakdown to build the SQL Queries:
import sys # import sys so we can use sys.stdout.write pstartNum = int(input("Source Port number start: ")) # Asking to have the user (You) enter a start number. q1 = "SELECT COUNT(*) AS Port_" # Start of SQL Query, in this case we are getting a count, instead of a display of the values matching the query. q2 = "FROM [dbo].[zy2016-09]" # From portion of a SQL Query qWat = "WHERE [Column 1] like '%:" # Query for your search criteria. qClose = "'" # Close the SQL Query for the like string. for x in range(100): # Run this loop 100 times. sys.stdout.write("%s%s %s \n" % (q1, pstartNum, q2)); # The left side with %s placement, says grab the 1st part of the Select variable, Current iteration of loop, with a space added for the variable for 'FROM (tablename)'. sys.stdout.write("%s%s%s \n" % (qWat, pstartNum, qClose)); # Here the %s triple string placement is to have no spaces. We call the Query, current iteration of the start number from the loop, and close the line with a "'". print("GO") # Execute each query instead of waiting for all of them to complete. print() # prints a blank line. I saw some crazy code to do this by other means, but since all I want is a blank line, this is way easier and fits the bill. pstartNum = pstartNum - 1 # Once the 1st loop finishes, subtract 1 from the starting port number. Do this for each iteration, by sayin that variable is equal to -1 from it's current state. (Please note you will error if you start with a port number less than 100.) exit # stops the 100 loop, so I can copy and paste this into my SQL query window and get results.
Remember this is a breakdown of the source code from the above 3rd post.
If you want this to run for 1000 iterations, change the for x in range(100) to for x in range(1000) or whatever works for you.
If you are not already doing this on your own equipment, you may anger some server admin with resource utilization. Or in the case of using Amazon Web Services or other hosting, your bill might get nasty expensive.
Table structure and Imports revisited
In the case of dealing with Zyxel logs, we are better off leaving the 1st line for headers, but removing the line of ‘=’ as shown in post #5. Once this line is removed, save the .log file and we can import the records into a table using MS SQL Import Wizard. This way, we know everything imported ok, and if it fails, most often it’s because your input column has more characters than your Column defined in the new database table.
In the case of a Zyxel firewall log, this table structure should match the log format and import without error. Here are the columns to a supported character import length:
[Time] varchar(50) [Source] varchar(50) [Destination] varchar(50) [Priority] varchar(50) [Category] varchar(50) [Note] varchar(250) [Source Interface] varchar(50) [Destination Interface] varchar(50) [Protocol] varchar(50) [Message] varchar(450)
There would be some spaces padded into the inserted column names, but at least it would be consistent to what your source data is. You can edit the Design of the table after the import, since changing the column names will not break the data sets. In queries the spaces on the end seem to be ignored, so edit at your preference.
When doing the import, you can click advanced on the Flat File Source portion of the import and in Advanced, you can edit the OutputColumnWidth to match the listed VarChar parameters.