- .. -*- rst -*-
-
  ==============================
  Online Advertising: Ad Serving
  ==============================
@@ -35,15 +33,17 @@ The examples that follow use the Python programming language and the
  :api:`PyMongo <python/current>` :term:`driver` for MongoDB, but you
  can implement this system using any language you choose.

- Design 1: Basic Ad Serving
- --------------------------
+ Serving Basic Ads
+ -----------------

- A basic ad serving algorithm consists of the following steps:
+ A basic ad serving algorithm consists of the following steps:

- #. The network receives a request for an ad, specifying at a minimum the
+ #. The network receives a request for an ad, specifying at a minimum the
     ``site_id`` and ``zone_id`` to be served.
+
  #. The network consults its inventory of ads available to display and chooses an
     ad based on various business rules.
+
  #. The network returns the actual ad to be displayed, possibly recording the
     decision made as well.

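The three steps above can be sketched in plain Python. This is only an illustration under stated assumptions: the in-memory ``INVENTORY`` dictionary, the ``DECISION_LOG`` list, the ``serve_ad()`` function, and the ``ad_unit_id`` field are hypothetical stand-ins, and "highest ``ecpm`` wins" is just one possible business rule.

```python
import random

# Hypothetical in-memory inventory keyed by (site_id, zone_id); a real
# deployment would read this from the ad inventory collection instead.
INVENTORY = {
    ("site-1", "zone-a"): [
        {"ad_unit_id": "ad-1", "ecpm": 2.5},
        {"ad_unit_id": "ad-2", "ecpm": 1.0},
    ],
}

DECISION_LOG = []  # stand-in for whatever store records serving decisions


def serve_ad(site_id, zone_id, inventory=INVENTORY, log=DECISION_LOG):
    """Walk through the three steps for a single ad request."""
    # Step 1: the request identifies the site and zone to fill.
    candidates = inventory.get((site_id, zone_id), [])
    if not candidates:
        return None
    # Step 2: apply a business rule; here, simply prefer the highest ecpm.
    best_ecpm = max(ad["ecpm"] for ad in candidates)
    ad = random.choice([a for a in candidates if a["ecpm"] == best_ecpm])
    # Step 3: return the ad, recording the decision along the way.
    log.append(
        {"site_id": site_id, "zone_id": zone_id, "ad_unit_id": ad["ad_unit_id"]}
    )
    return ad
```

Calling ``serve_ad("site-1", "zone-a")`` returns the ``ad-1`` entry and appends one record to the log; an unknown site or zone returns ``None`` without logging anything.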
@@ -52,8 +52,8 @@ as well as information stored in the ad inventory collection, to make the ad
  targeting decisions. Later examples will build on this, allowing more advanced ad
  targeting.

- Schema Design
- ~~~~~~~~~~~~~
+ Schema
+ ~~~~~~

  A very basic schema for storing ads available to be served consists of a single
  collection, ``ad.zone``:
@@ -83,6 +83,9 @@ ads, sorted by their ``ecpm`` values.
  Choosing an Ad to Serve
  ~~~~~~~~~~~~~~~~~~~~~~~

+ Querying
+ ````````
+
  The query you'll use to choose which ad to serve selects a compatible ad and
  sorts by the advertiser's ``ecpm`` bid in order to maximize the ad network's
  profits:
@@ -101,8 +104,8 @@ profits:
  ecpm, ad_group = ecpm_groups.next()
  return choice(list(ad_group))

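The grouping step in the code above can be tried without a database. The sketch below assumes the ads are plain dictionaries with an ``ecpm`` field; the function name ``choose_from_top_ecpm()`` is hypothetical, and it uses the built-in ``next()`` so that it runs on Python 3.

```python
from itertools import groupby
from random import choice


def choose_from_top_ecpm(ads):
    """Return a random ad from the group with the highest ecpm, or None."""
    # groupby only merges adjacent items, so sort by ecpm (highest first).
    ranked = sorted(ads, key=lambda ad: ad["ecpm"], reverse=True)
    ecpm_groups = groupby(ranked, key=lambda ad: ad["ecpm"])
    try:
        # next(iterator) is the Python 3 spelling of iterator.next().
        ecpm, ad_group = next(ecpm_groups)
    except StopIteration:
        return None  # no ads available for this zone
    return choice(list(ad_group))
```

Picking at random among ads that tie on ``ecpm`` spreads impressions across equally profitable ads rather than always serving the first one in the list.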
- Index Support
- `````````````
+ Indexing
+ ````````

  In order to execute the ad choice with the lowest latency possible, you'll want
  to have a compound index on (``site_id``, ``zone_id``):
@@ -116,6 +119,9 @@ to have a compound index on (``site_id``, ``zone_id``):
  Making an Ad Campaign Inactive
  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

+ Updating
+ ````````
+
  One case you'll have to deal with in this solution is making a campaign
  inactive. This may happen for a variety of reasons. For instance, the campaign
  may have reached its end date or exhausted its budget for the current time
@@ -133,8 +139,8 @@ The update statement above first selects only those ad zones which had avaialabl
  ads from the given ``campaign_id`` and then uses the ``$pull`` modifier to remove
  them from rotation.

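To make the two halves of that update easier to see, the sketch below only builds the filter and update documents without talking to a server. The helper name ``build_deactivate_campaign_update()`` is hypothetical, and the ``ads`` array with a ``campaign_id`` field per element is assumed from the ``ads.campaign_id`` path used in this section.

```python
def build_deactivate_campaign_update(campaign_id):
    """Build the (filter, update) pair that removes a campaign's ads."""
    # Match only zone documents that still contain an ad from this campaign.
    query = {"ads.campaign_id": campaign_id}
    # $pull removes every element of the ``ads`` array matching the condition.
    update = {"$pull": {"ads": {"campaign_id": campaign_id}}}
    return query, update
```

With a recent PyMongo release, the pair could then be passed to ``Collection.update_many(query, update)`` so that every matching zone is updated in one call; which collection handle to use depends on how your application connects to the ``ad.zone`` collection.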
- Index Support
- `````````````
+ Indexing
+ ````````

  In order to execute the multi-update quickly, you should maintain an index on the
  ``ads.campaign_id`` field:
@@ -158,8 +164,8 @@ good approach is to shard on the (``site_id``, ``zone_id``) combination:
  ... 'key': {'site_id': 1, 'zone_id': 1} })
  { "collectionsharded": "ad.zone", "ok": 1 }

- Design 2: Adding Frequency Capping
- ----------------------------------
+ Adding Frequency Capping
+ ------------------------

  One problem with the logic described in Design 1 above is that it will tend to
  display the same ad over and over again until the campaign's budget is
@@ -174,8 +180,8 @@ transmitted to the ad network when logging impressions, clicks, conversions,
  etc., as well as the ad serving decision. This section focuses on how that
  profile data impacts the ad serving decision.

- Schema Design
- ~~~~~~~~~~~~~
+ Schema
+ ~~~~~~

  In order to use the user profile data, you need to store it. In this case, it's
  stored in a collection ``ad.user``:
@@ -210,17 +216,22 @@ There are a few things to note about the user profile:
  - Profile information is segmented by advertiser. Typically advertising data is
    sensitive competitive information that can't be shared among advertisers, so
    this must be kept separate.
+
  - All data is embedded in a single profile document. When you need to query this
    data (detailed below), you don't necessarily know which advertiser's ads you'll
    be showing, so it's a good practice to embed all advertisers in a single
    document.
+
  - The event information is grouped by event type within an advertiser, and sorted
    by timestamp. This allows rapid lookups of a stream of a particular type of
    event.

  Choosing an Ad to Serve
  ~~~~~~~~~~~~~~~~~~~~~~~

+ Querying
+ ````````
+
  The query you'll use to choose which ad to serve now needs to iterate through
  ads in order of desirability and select the "best" ad that also satisfies the
  advertiser's targeting rules (in this case, the frequency cap):
@@ -275,8 +286,8 @@ stored in the user profile, from most recent to oldest, within a certain
  appears in the impression stream, the ad is rejected. Otherwise it is acceptable
  and can be shown to the user.

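The frequency-cap check described above can be exercised on its own. The sketch below is a minimal illustration, not the tutorial's own code: the function name ``ad_is_acceptable()``, the one-day default window, and the ``timestamp`` and ``ad_unit_id`` field names are assumptions, and the impression list is assumed to be sorted oldest first.

```python
from datetime import datetime, timedelta


def ad_is_acceptable(ad_unit_id, impressions, now, window=timedelta(days=1)):
    """Return False if the ad appears in the impression stream within the window.

    ``impressions`` is assumed to be sorted by timestamp, oldest first.
    """
    threshold = now - window
    # Walk from the most recent impression back toward the oldest.
    for event in reversed(impressions):
        if event["timestamp"] < threshold:
            break  # everything earlier is outside the window
        if event["ad_unit_id"] == ad_unit_id:
            return False  # already shown to this user recently
    return True
```

Because the stream is sorted, the loop can stop at the first impression older than the threshold instead of scanning the user's entire history.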
- Index Support
- `````````````
+ Indexing
+ ````````

  In order to retrieve the user profile with the lowest latency possible, there
  needs to be an index on the ``_id`` field, which MongoDB supplies by default.
@@ -293,8 +304,8 @@ When sharding the ``ad.user`` collection, choosing the ``_id`` field as a
  ... 'key': {'_id': 1 } })
  { "collectionsharded": "ad.user", "ok": 1 }

- Design 3: Keyword Targeting
- ---------------------------
+ Keyword Targeting
+ -----------------

  Where frequency capping above is an example of user profile targeting, you may
  also wish to perform content targeting so that the user receives relevant ads for
@@ -303,9 +314,8 @@ at the result of a search query. In this case, a list of ``keywords`` is sent to
  the ``choose_ad()`` call along with the ``site_id``, ``zone_id``, and
  ``user_id``.

-
- Schema Design
- ~~~~~~~~~~~~~
+ Schema
+ ~~~~~~

  In order to choose relevant ads, you'll need to expand the ``ad.zone`` collection
  to store keywords for each ad: