Guoliang Li, Shuo Chen, Jianhua Feng, Kian-lee Tan, Wen-Syan Li
Although influence maximization, which selects a set of users in a social network to maximize the expected number of users influenced by the selected users (called influence spread), has been extensively studied, existing works neglected the fact that the location information can play an important role in influence maximization. Many real-world applications such as location-aware word-of-mouth marketing have location-aware requirement. In this paper we study the location-aware influence maximization problem. One big challenge in location-aware influence maximization is to develop an efficient scheme that offers wide influence spread.
To address this challenge, we propose two greedy algorithms with 1 − 1/e approximation ratio. To meet the instantspeed requirement, we propose two efficient algorithms with ǫ · (1 − 1/e) approximation ratio for any ǫ ∈ (0, 1]. Experimental results on real datasets show our method achieves high performance while keeping large influence spread and significantly outperforms state-of-the-art algorithms.
All in plain text files (unzip first). Must be unzipped under the data/
directory. Because I only use data_name
, not the file path as the input argument. For example, to use gowalla.inf
and gowalla.loc
dataset, only need to say gowalla
in the input argument list. See details below.
the social graph has extension '.inf'
the beginning line is the number of users (user id starts from 1 in this dataset), and the number of bi-directed edges (also half of the total rows in the file minus the first line, since each row represents a directed edge).
the rest lines are splitted by " "
. First two fields are the ids, and the third is the weight from source node to target, and the last vice versa.
[# users] [# bi-directed edges]
[source-id] [target-id] [weight-1] [weight-2]
# example
3 2
1 3 0.00047619047619 0.00444444444444
3 1 0.00444444444444 0.00047619047619
1 5 0.0266666666667 0.0266666666667
5 1 0.0266666666667 0.0266666666667
the social graph has extension '.loc'
the beginning line is the number of rows, also number of users (must be the same with the social graph data. user id starts from 0 in this dataset) with location information.
the rest lines are splitted by " "
.
[# users]
[id] [latitude] [longitude]
# example
5
0 30.262996 -97.750338
1 46.040318 6.6285
2 34.059847 -118.341607
3 37.780893 -122.414839
4 30.286406 -97.733543
Your queries file should have the following format.
The first line is the number of queries (also number of rows in this file minus the first line).
The rest, each line is a query, with following format.
[latitude] [longitude] [width] [height] [k]
width
is represented by number of latitudes.
height
is represented by number of longitudes.
In total they represent a box centering at (latitude, longitude)
, with width
and height
.
IMPORTANT NOTE : make sure the id in location file is 0 based and id in social graph file is 1 based. Also, make sure these are one-to-one mapped. I,E, there can not be a user in the social graph but without location data.
./mip -method [k] [data_name] < [queries]
expansion method
./mip -m 20 gowalla < queries
assembly method
./mip -a 20 gowalla < queries
hint method
./mip -h 20 gowalla < queries
Hint or assembly method may need wait for a while at the beginning, since it will build index on-the-go.
IMPORTANT: you need to create a tmp/
folder in the current directory, that's where output files will be located.
expansion method
outfile named as [data_name]_exp
.
[time] [k] [spread] [# users within query area]
assembly method
outfile named as [data_name]_exp
.
[time] [k] [spread] [# users within query area]
hint method
outfile named as [data_name]_exp
.
[time] [k] [spread] [# users within query area]
For any question about this study, please feel free to contact Guoliang Li.