Gurvinder Singh / spark_apps

Commit c07c3512
authored Jul 01, 2014 by Gurvinder Singh

fixed sql app

parent cef69991

Showing 2 changed files with 7 additions and 6 deletions

pythonApp/netflowTest.py  +1 -1
pythonApp/sqlApp.py  +6 -5

pythonApp/netflowTest.py
@@ -5,7 +5,7 @@ import argparse
 DESCRIPTION = "Analyze netflow data"
 conf = SparkConf()
-conf.setAppName("Netflow test").set("spark.executor.memory", "1g").set("spark.default.parallelism", 15).set("spark.mesos.coarse", "true")
+conf.setAppName("Netflow test").set("spark.executor.memory", "1g").set("spark.default.parallelism", 15).set("spark.mesos.coarse", "false")
 sc = SparkContext(conf=conf)
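
Review note: the only change in this file flips "spark.mesos.coarse" from "true" to "false" inside a long chained call. A minimal sketch of one way to make such toggles easier to spot in a diff is to keep the settings in a dictionary and apply them in a loop. The names NETFLOW_SETTINGS, apply_settings and RecordingConf are hypothetical and not part of this commit. RecordingConf is only a stand-in so the sketch runs without a cluster; the sketch assumes the real object exposes set(key, value), as SparkConf does.

```python
# Hypothetical helper, not part of the commit: settings kept in one place.
NETFLOW_SETTINGS = {
    "spark.executor.memory": "1g",
    "spark.default.parallelism": "15",
    "spark.mesos.coarse": "false",  # value after this commit
}


def apply_settings(conf, settings):
    """Call conf.set(key, value) for every entry, passing values as strings."""
    for key, value in sorted(settings.items()):
        conf.set(key, str(value))
    return conf


class RecordingConf:
    """Stand-in for SparkConf that only records set() calls (dry runs only)."""

    def __init__(self):
        self.values = {}

    def set(self, key, value):
        self.values[key] = value
        return self


if __name__ == "__main__":
    conf = apply_settings(RecordingConf(), NETFLOW_SETTINGS)
    print(conf.values["spark.mesos.coarse"])
```

With a real cluster, the same call would presumably be apply_settings(SparkConf().setAppName("Netflow test"), NETFLOW_SETTINGS).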

pythonApp/sqlApp.py
@@ -3,12 +3,13 @@ from pyspark.sql import SQLContext
 sc = SparkContext(appName='testSQL')
 sqlCtx = SQLContext(sc)
-lines = sc.textFile("hdfs://daas/user/hdfs/trd_gw1_12_01_normalized.csv")
+#lines = sc.textFile("hdfs://daas/user/hdfs/csv-old/trd_gw1_12_01_normalized.csv")
+lines = sc.textFile("hdfs://daas/spark/test")
 parts = lines.map(lambda l: l.split(","))
-records = parts.map(lambda p: {"date": p[0], "src_ip": p[1], "dest_ip": p[2], "port": int(p[3])})
+records = parts.map(lambda p: {"text": p[0], "val": int(p[1]), "val1": int(p[2])})
 recordsTable = sqlCtx.inferSchema(records)
 recordsTable.registerAsTable("records")
-http = sqlCtx.sql("SELECT count(*) FROM records WHERE port <= 80)")
-print(http)
+http = sqlCtx.sql("SELECT text FROM records WHERE val1 >= 1 AND val1 <= 41")
+text = http.map(lambda p: "Text: " + p.text)
+print(text.collect())
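
Review note: the new records lambda calls int() on p[1] and p[2] directly, so a short or non-numeric row would raise inside the Spark job. The sketch below shows one possible defensive variant in plain Python, assuming rows of the form text,val,val1 as implied by the diff. The function names parse_record and in_range and the sample rows are hypothetical and not part of this commit.

```python
def parse_record(line):
    """Parse 'text,val,val1' into a dict, or return None for a malformed row."""
    parts = line.split(",")
    if len(parts) < 3:
        return None
    try:
        return {"text": parts[0], "val": int(parts[1]), "val1": int(parts[2])}
    except ValueError:
        return None


def in_range(record, low=1, high=41):
    """Mirror the WHERE clause from the diff: val1 >= 1 AND val1 <= 41."""
    return low <= record["val1"] <= high


if __name__ == "__main__":
    # Made-up rows for illustration only.
    sample = ["foo,1,10", "bar,2,99", "broken,row", "baz,x,5"]
    records = [r for r in map(parse_record, sample) if r is not None]
    print(["Text: " + r["text"] for r in records if in_range(r)])
```

In the Spark job this could plausibly replace the two map steps with lines.map(parse_record).filter(lambda r: r is not None) before inferSchema, though that has not been run against the data in this repository.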