Wallet Labeling + Wallet Search
Let me be perfectly clear: trading and investing in Web3 without recognizing the wallets of top performers, and what they are doing, is archaic.
The practice of digital asset institutions today goes beyond aggregate graphs, statistics, and TradingView. It is about on-chain analytics down to specific transactions. Do not fail to leverage the permissionless viewability of blockchains when every competitive player is constantly monitoring wallet and token watchlists for every move.
Wallet labeling is the process of identifying notable wallets, either to monitor them or to paint a picture of an entity’s token portfolio. Knowing which wallets likely belong to top investors and traders, one can set alerts for when assets are moved and follow those trades.
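As a rough illustration of such an alert, the sketch below scans a batch of transfers for watched addresses. The record layout (from, to, token, usd), the function name, and the placeholder addresses are assumptions made for this example; they do not describe any particular platform's data feed.

```python
def transfer_alerts(transfers, watchlist, min_usd=0.0):
    """Return one alert string per transfer leg that touches a watched wallet.

    transfers: iterable of dicts with keys 'from', 'to', 'token', 'usd'
               (a hypothetical record layout).
    watchlist: dict mapping lowercase address -> human-readable label.
    """
    alerts = []
    for t in transfers:
        if t["usd"] < min_usd:
            continue  # ignore moves below the size we care about
        for side in ("from", "to"):
            label = watchlist.get(t[side].lower())
            if label is not None:
                alerts.append(f"{label} ({side}): {t['token']} ${t['usd']:,.0f}")
    return alerts


# Placeholder addresses and amounts, for illustration only.
watchlist = {"0xaaa": "Fund A"}
transfers = [
    {"from": "0xAAA", "to": "0xbbb", "token": "USDC", "usd": 1_500_000},
    {"from": "0xccc", "to": "0xddd", "token": "USDT", "usd": 9_000_000},
    {"from": "0xeee", "to": "0xaaa", "token": "DAI", "usd": 500},
]
alerts = transfer_alerts(transfers, watchlist, min_usd=100_000)
```

Only the first transfer produces an alert here: the second touches no watched wallet, and the third falls below the size threshold.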
For blockchain analytics platforms like Nansen, wallets are labeled algorithmically. Nansen labels are expressive: trader type, high value, smart money, etc. One wallet can have tens of labels. We aspire to have our wallet labeled as “legendary NFT collector” (top 0.1% in number of ERC721 token transactions). Cool, indexed categories at nansen.ai can be searched through depending on the trading signal you are looking for.
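A rank-based label like "top 0.1% by ERC721 transaction count" could be computed along the following lines. This is a minimal sketch under our own assumptions (a plain dict of per-wallet counts, a simple rank cutoff); it is not a description of how Nansen actually assigns its labels.

```python
import math


def top_fraction_label(tx_counts, fraction=0.001):
    """Return the wallets whose count ranks within the top `fraction` of all wallets.

    tx_counts: dict mapping address -> number of ERC721 transactions
               (hypothetical input format).
    """
    if not tx_counts:
        return set()
    # Number of wallets that receive the label; always keep at least one.
    k = max(1, math.floor(len(tx_counts) * fraction))
    ranked = sorted(tx_counts, key=tx_counts.get, reverse=True)
    return set(ranked[:k])


# 2,000 made-up wallets, where wallet "w<i>" has made i transactions.
counts = {f"w{i}": i for i in range(2000)}
legendary = top_fraction_label(counts, fraction=0.001)
```

With 2,000 wallets and a 0.1% cutoff, exactly two wallets receive the label.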
Addresses can be published during legal proceedings. For example, in our 3AC case study, a set of wallets belonging to 3AC was publicized due to legal proceedings. The user @shogun aggregated the wallet contents into a nice Dune dashboard so anyone can view a lower-bound estimate of 3AC’s portfolio.
There are two common problems during the wallet labeling process.
The first is that the most interesting labels are probabilistic. If a wallet places a large ERC20 trade ahead of a major project announcement, could it belong to one of the project’s institutional investors? There is no ground truth to confirm this.
The second is the absence of labels, especially in niche categories. For example, one may start with 2 wallet labels belonging to a fund of interest, but the net asset value of these wallets constitutes only 10% of the fund’s estimated size.
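The coverage gap in that example is simple arithmetic. The sketch below uses made-up dollar figures chosen to reproduce the 10% case; the function name is our own.

```python
def label_coverage(labeled_nav_usd, estimated_fund_size_usd):
    """Share of a fund's estimated size that its labeled wallets account for."""
    if estimated_fund_size_usd <= 0:
        raise ValueError("estimated fund size must be positive")
    return sum(labeled_nav_usd) / estimated_fund_size_usd


# Illustrative numbers only: two labeled wallets against a $100M estimate.
coverage = label_coverage([6_000_000, 4_000_000], 100_000_000)
unexplained = 1 - coverage  # share of the portfolio still unlabeled
```

Here the two wallets cover 10% of the estimate, leaving roughly 90% of the portfolio to be found.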
Can more exploration be done to sketch out the entire portfolio?
We propose The Map. The Map’s search function solves these two shortcomings. Starting with a label that is a maybe, its proximity to other wallets, defined by The Map, grants higher (or lower) confidence. In the case of few niche labels, The Map is a core asset to operators when they explore for new wallets of interest.
Search. Given a query wallet, the search function returns wallets that are operated similarly. What do we mean by ‘similar’? The returned suggestions share matching token preferences and transaction sizes with the query (and are found using the latest developments in machine learning). Instead of defining label categories, The Map assigns a distance to quantify the similarity between wallets.
Search and a couple of labels go far! Use search in one of two ways:
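To make the idea of distance-based search concrete, here is a deliberately simple stand-in. The Map's actual features and model are not described in this post, so everything below is an assumption: each wallet is summarized by the share of its transfers per token plus the log of its median transfer size, and similarity is plain Euclidean distance.

```python
import math
from collections import Counter


def wallet_features(transfers):
    """Summarize a wallet as a sparse feature dict.

    transfers: list of (token_symbol, usd_value) pairs (hypothetical format).
    """
    if not transfers:
        return {}
    counts = Counter(token for token, _ in transfers)
    total = sum(counts.values())
    # Token preference: fraction of transfers made in each token.
    features = {f"token:{t}": c / total for t, c in counts.items()}
    # Transaction size: log10 of the (upper) median transfer value.
    values = sorted(v for _, v in transfers)
    median = values[len(values) // 2]
    features["log10_median_usd"] = math.log10(max(median, 1.0))
    return features


def distance(a, b):
    """Euclidean distance between two sparse feature dicts."""
    keys = set(a) | set(b)
    return math.sqrt(sum((a.get(k, 0.0) - b.get(k, 0.0)) ** 2 for k in keys))


def search(query, index, k=3):
    """Return the k wallets nearest to `query` as (address, distance) pairs."""
    q = index[query]
    scored = [(addr, distance(q, f)) for addr, f in index.items() if addr != query]
    return sorted(scored, key=lambda pair: pair[1])[:k]


# Made-up wallets, for illustration only.
index = {
    "query": wallet_features([("USDC", 1_000_000), ("PAXG", 500_000)]),
    "near": wallet_features([("USDC", 2_000_000), ("PAXG", 1_000_000)]),
    "far": wallet_features([("SHIB", 100), ("PEPE", 50)]),
}
results = search("query", index, k=2)
```

The wallet with the same token mix and a similar transfer size ranks first. A learned embedding would replace `wallet_features` and `distance`, but the search step itself would look much the same.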
In label verification, a suggested wallet, perhaps from an analyst, is compared with previously labeled wallets to confirm the suggestion. In the above example, The Map assigned higher confidence to the suggested wallet due to its similarity to a known hedge fund X wallet.
What started as the analyst’s guess is now corroborated by The Map.
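One way to turn proximity into a confidence figure is sketched below. The score used here, the share of unlabeled reference wallets that lie farther from the suggestion than its nearest labeled wallet, is our own illustrative choice, and the feature vectors are placeholders; The Map's actual confidence measure is not specified in this post.

```python
import math


def verification_score(suggestion, labeled, reference):
    """Score in [0, 1]; higher means the suggestion sits unusually close to a labeled wallet.

    suggestion: feature vector (tuple of floats) of the wallet being checked.
    labeled:    feature vectors of wallets already carrying the label.
    reference:  feature vectors of unrelated wallets used as a baseline.
    """
    nearest_labeled = min(math.dist(suggestion, v) for v in labeled)
    farther = sum(1 for v in reference if math.dist(suggestion, v) > nearest_labeled)
    return farther / len(reference)


# Placeholder 2-D vectors, for illustration only.
score = verification_score(
    suggestion=(0.0, 0.0),
    labeled=[(1.0, 0.0), (5.0, 5.0)],
    reference=[(3.0, 0.0), (0.0, 4.0), (0.5, 0.0), (10.0, 10.0)],
)
```

In this toy case three of the four reference wallets are farther away than the nearest labeled wallet, giving a score of 0.75.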
Label expansion proposes new wallets. Querying a known, labeled wallet returns the wallets that behave most similarly to it. Working through the suggestions, the analyst may audit neighboring wallets to confirm or deny labels.
This expands the existing label set with suggestions based on behavioral signals.
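Label expansion can be pictured as building an audit queue: query every labeled wallet, collect its nearest unlabeled neighbors, and sort the candidates by distance. The sketch below assumes wallets are already represented as feature vectors (placeholder tuples here) and uses plain Euclidean distance as a stand-in for The Map's distance.

```python
import math


def expansion_candidates(labeled, unlabeled, per_query=3):
    """Build an audit queue of unlabeled wallets near any labeled wallet.

    labeled, unlabeled: dicts mapping address -> feature vector (tuple of floats).
    Returns (address, distance, nearest_labeled_address) rows, closest first.
    """
    best = {}
    for q_addr, q_vec in labeled.items():
        ranked = sorted(unlabeled, key=lambda a: math.dist(q_vec, unlabeled[a]))
        for addr in ranked[:per_query]:
            d = math.dist(q_vec, unlabeled[addr])
            # Keep the closest labeled wallet as the reason for the suggestion.
            if addr not in best or d < best[addr][0]:
                best[addr] = (d, q_addr)
    rows = [(addr, d, q) for addr, (d, q) in best.items()]
    return sorted(rows, key=lambda row: row[1])


# Placeholder vectors, for illustration only.
labeled = {"L1": (0.0, 0.0), "L2": (10.0, 10.0)}
unlabeled = {"u1": (1.0, 0.0), "u2": (8.0, 10.0), "u3": (50.0, 50.0)}
queue = expansion_candidates(labeled, unlabeled, per_query=1)
```

Each row records which labeled wallet triggered the suggestion, so the analyst knows what to compare against during the audit. Candidates that turn out to be labeled already elsewhere can simply be removed from `unlabeled` before the next pass.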
The 3AC Case Study
A set of wallets belonging to 3AC was publicized following legal proceedings. The user @shogun aggregated the wallet contents into a nice Dune dashboard so anyone can view a lower-bound estimate of 3AC’s portfolio.
Executive summary
We applied search to three 3AC wallets and examined each of their top three results (9 results total). By spending 5–10 minutes reviewing each result, we discovered 3 new labels that may be monitored for 3AC trading signals.
Our starting points are three wallets: 0x3BA21b…7E247 is a Curve liquidity provider. 0x71603…f3E9AE holds Compound Ether (cETH) and works with Compound. Etherscan tags 0x4862…388b3 as “Three Arrows Capital”; its role is to receive USDC and USDT from large exchanges and ferry them to private wallets.
In general, 3AC wallets are heterogeneous in the token types they hold, probably because no two wallets share the same role. If some wallets shared the same responsibility, and thus transacted a common set of tokens, we could pursue label verification: hold out one wallet, treat it as the “suggestion”, and check whether it is 3AC’s using the other labeled wallets.
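The hold-out check described above can be sketched as a leave-one-out loop. The feature vectors and the "hit" criterion (at least one labeled wallet among the k nearest neighbors) are assumptions made for this illustration, with Euclidean distance standing in for The Map's distance.

```python
import math


def leave_one_out_hits(labeled, background, k=3):
    """For each labeled wallet, hold it out and test whether search would recover the label.

    labeled, background: dicts mapping address -> feature vector, with disjoint keys.
    Returns a dict: held-out address -> True if any of its k nearest neighbors
    among the remaining wallets is labeled.
    """
    hits = {}
    for held_out, vec in labeled.items():
        # Search pool: the other labeled wallets plus all background wallets.
        pool = {a: v for a, v in labeled.items() if a != held_out}
        pool.update(background)
        nearest = sorted(pool, key=lambda a: math.dist(vec, pool[a]))[:k]
        hits[held_out] = any(a in labeled for a in nearest)
    return hits


# Placeholder vectors: "a" and "b" play similar roles, "c" plays a different one.
labeled = {"a": (0.0, 0.0), "b": (0.0, 1.0), "c": (100.0, 100.0)}
background = {"x": (5.0, 5.0), "y": (90.0, 90.0), "z": (50.0, 50.0)}
hits = leave_one_out_hits(labeled, background, k=1)
```

Wallets "a" and "b" recover each other, while "c" does not, which mirrors the point about heterogeneous roles: verification only works when labeled wallets behave alike.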
Instead, we will show label expansion: suggesting wallets in the wild that resemble a 3AC query wallet closely enough that they may be undiscovered 3AC wallets.
Our barebones search UI looks like the following. Each result address is followed by its distance to the query.
As results are audited, we can scan the set of common tokens transacted by both the result and query wallets. In the orange box, we see the query wallet has transacted ether 29 times (first number in parentheses), and the result wallet has transacted it 25 times (second number in parentheses).
Likewise, counterparties shared between the result wallet and the query wallet are displayed in the purple box. Note the number of transactions between that counterparty and the query (first number in parentheses), and between that counterparty and the result (second number in parentheses).
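The paired counts shown in both boxes amount to intersecting two tallies. A minimal sketch, assuming each wallet's history is available as a flat list of token symbols (or counterparty addresses), one entry per transaction:

```python
from collections import Counter


def shared_counts(query_items, result_items):
    """Items seen in both histories, mapped to (query_count, result_count).

    Works for token symbols or counterparty addresses alike.
    Most active shared items come first; ties are broken alphabetically.
    """
    q, r = Counter(query_items), Counter(result_items)
    shared = q.keys() & r.keys()
    ordered = sorted(shared, key=lambda item: (-(q[item] + r[item]), item))
    return {item: (q[item], r[item]) for item in ordered}


# Ether counts follow the example in the text; the other tokens are filler.
query_tokens = ["ETH"] * 29 + ["USDC"] * 3
result_tokens = ["ETH"] * 25 + ["DAI"] * 2
common = shared_counts(query_tokens, result_tokens)
```

Only ether appears in both histories, so the result is a single entry with the pair (29, 25).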
When a query’s most similar wallet is distant (denoted in green), nothing similar to the query could be found without looking far away, so the query is probably exotic. On the other hand, a query’s suggestions may all be close, which means it sits in the midst of a large cohort of common wallets.
BlockFi 6: 0x530e0…fe536
Query 0x530e0…fe536 is labeled as BlockFi 6. Its prevalent tokens, from most to least common: USDC, Paxos Gold, BAT, Chainlink, Gemini dollar, UNI.
The USDC and Gemini dollar amounts range from 250,000 to 2.5 million. It periodically interacts with Fund 0xf15…5ae, which is also a frequent counterparty in our third result. Overall, this is a fairly unique address, as few wallets work with Paxos Gold.
The three most relevant results are
0x88a7ef…e9d8424 (136.128)
0x0548f…34f558e (143.109)
0x4a4a3…2448a35 (147.036)
0x88a7ef…e9d8424 prevalent tokens, most to least common: Chainlink, USDC, BAT, Uniswap.
While the tokens of interest are a close match, 0x88a7ef…e9d8424 does not transfer Paxos Gold, and its transaction values are less than 10% of those of 0x530e0…fe536. So this result can be dropped.
0x0548f…34f558e prevalent tokens, most to least common: USDC, USDT, Paxos Gold, Uni, Aave, Dai, Chainlink, and a mix of popular DeFi protocols. This address was already identified as Genesis OTC Desk 2. The USDC, USDT transaction values range from 122,000 to 10 million, at times 20 million. It shares commonality with the query due to its usage of Paxos Gold, large transaction values, and mix of DeFi token transactions.
0x4a4a3…2448a35 prevalent tokens: USDC, USDT, Paxos Gold, Chainlink, Dai, Matic, Uni, BAT. This result, despite being slightly more distant than the previous two (at 147), is the closest match to the query after inspection. It transfers large values, is interested in the same set of tokens, and, most notably, interacts with Fund 0xf15…5ae. In practice, 0x4a4a3…2448a35 warrants the same label as the query given how closely they operate. Unfortunately, 0x4a4a3…2448a35 was already labeled as one of 3AC’s.
The lesson is that new labels are not guaranteed to appear in the results. But don’t be disheartened by the inevitable duds; results from The Map are becoming more relevant with every iteration.
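The manual checks applied to the three results above (does the result hold the query's distinctive token, are its transaction values comparable, does it share a counterparty) could be captured in a small helper. The wallet summary format, the function name, and all values below are hypothetical; the 10% ratio echoes the rule of thumb used when dropping the first result.

```python
def audit_checks(query, result, signature_token, min_value_ratio=0.1):
    """Run three quick audit checks on a search result.

    query, result: dicts with 'tokens' (set of symbols), 'typical_usd' (float),
                   and 'counterparties' (set of addresses); a hypothetical summary.
    signature_token: the distinctive token that makes the query unusual.
    """
    return {
        "holds_signature_token": signature_token in result["tokens"],
        "comparable_value": result["typical_usd"] >= min_value_ratio * query["typical_usd"],
        "shared_counterparties": sorted(query["counterparties"] & result["counterparties"]),
    }


# Placeholder summaries, for illustration only.
query = {
    "tokens": {"USDC", "PAXG", "BAT", "LINK"},
    "typical_usd": 1_000_000,
    "counterparties": {"fund_x", "exchange_y"},
}
result = {
    "tokens": {"LINK", "USDC", "BAT", "UNI"},
    "typical_usd": 50_000,
    "counterparties": {"exchange_y"},
}
checks = audit_checks(query, result, signature_token="PAXG")
```

A result that fails the first two checks, as this placeholder does, is a candidate for dropping even when its distance to the query is small. The final call still belongs to the analyst.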
Curve Liquidity Provider 0x3BA21b…7E247
Query 0x3BA21b…7E247 has these prevalent tokens: Curve.fi yDAI+yUSDC+yUSDT+TUSD, USDC, USDT, CRV, Dai, POW BTC-35, True USD. It actively operates DeFi protocols, especially this distinct Curve stable pool, and moves several million dollars in value across its transactions.
0x358189…300d1e (96.296)
0xc43db4…144f7 (107.02)
0x97960…77632 (112.357)
0x358189…300d1e prevalent tokens: USDC, Curve Dao, LUSD, LQTY, USDC, Convex Token, Alchemix, Dai, BarnBridge. This seems to be another liquidity provider that works with the Curve and Liquity protocols, with an average transaction value similar to the query’s, i.e. several million. If the analyst is identifying liquidity providers, this address may be labeled.
0xc43db4…144f7 prevalent tokens: Curve Dao, DAI, Lido Dao, stETH, WETH, Curve.fi yDAI+yUSDC+yUSDT+TUSD. Not only does this address contribute to the same stablecoin pool, it also started trading YAM around September 2020, about when the query also dabbled in YAM. This address should be analyzed further for labeling.
0x97960…77632 tokens: USDC, Curve DAO, Lido DAO, stETH. This result is a dud: its history suggests a role in Lido staking and Curve operations, which is not relevant to the query.
The one labeled “Three Arrows Capital” 0x4862…388b3
Prevalent tokens: USDC, USDT, FTT, stETH, Aave, Fantom. It receives tokens from Binance, Bitfinex, OKX, FTX, Huobi, and moves its assets onward to private wallets.
0x82ac…e6b64 (100.081)
0xb5ceb…d1808e (101.556)
0x82ac…e6b64 prevalent tokens: USDC, USDT, FTT, Fantom, Axie Infinity. This address is designated by Etherscan as “Fund: 0x82a…b64”, and interacts with a number of 3AC wallets: BlockFi 6 (covered earlier), Three Arrows Capital: Binance Deposit, Fund: 0x3BA…247. So not only does it resemble Three Arrows Capital’s main wallet in behavior, it also transacts directly with 3AC wallets. Etherscan did name it as a fund, but it does not appear in the original 3AC label set given to us. It clearly warrants a label.
0xb5ceb…d1808e prevalent tokens: USDC, USDT, TrueUSD, Aave, Chainlink. Its main usage is to receive stablecoins from OKX, Bitfinex, Kraken, FTX, Circle, and Binance and transmit them to private wallets. These are largely the same exchanges that send funds to 0x4862…388b3. 0xb5ceb…d1808e plays the same role as the query, and warrants a label.
Debrief
Using label expansion in this case study, the analyst found 3 addresses that can be added to the label set.
Out of the 10 disclosed wallets, we queried 3. The queried wallets were selected because they had very particular traits: if a result wallet shares such a trait, we can make the case for labeling it, citing this ‘smoking gun’.
Where is the search function most challenged?
1) Wallets that transfer generic tokens (USDC, USDT) or transfer between general addresses, e.g. Uniswap V3 Router 2. In this case many results are close to the query, but are similarly generic in behavior.
2) Networks of private wallets, where fine distinctions between wallets are needed for relevant results.
Conclusion
The Map is getting better all the time. We hope you are ready for strong trading signals through larger wallet watchlists, a deeply explorable Web3 of addresses, and prime blockchain analytics.
We build Web3 native AI at
For a demo of search using your own queries, send an email to lihan.the.map@gmail.com with the subject “demo” and up to 5 addresses in the email body. Please do not write anything else in the body.