Added a rough sketch of the specification and project structure.
parent 8b7e37f30f
commit b9353eeb24

5 changed files with 79 additions and 1 deletion

README.md | 38

@@ -1,3 +1,39 @@
# elboob
			
A site search engine for Cryogen with search on the client side
			
## Design intention
			
This project is intended to be in two parts:
			
### The compiler
			
A Clojure function which scans a list of directories of Markdown files and produces a map from each lexical token occurring in those files (Markdown formatting, common words, punctuation etc. excepted) to a second map, which keys the relative path of each file in which the token occurs to the number of times the token occurs in that file.
			
Thus, supposing we had one file, with the path name `content/md/posts/aquarius.md`, with the content
			
> # The Age of Aquarius
>
> This is the dawning of the Age of Aquarius.
			
Then the output should be
			
```clojure
{"age" {"content/md/posts/aquarius.md" 2}
 "aquarius" {"content/md/posts/aquarius.md" 2}
 "dawning" {"content/md/posts/aquarius.md" 1}}
```
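
The compiler described above might be sketched roughly as follows. This is only an illustration under stated assumptions, not the project's implementation (which has not started): the namespace, the function names, the tiny stop-word list and the tokenising regex are all hypothetical, and splitting on non-alphanumeric characters is a crude stand-in for real Markdown stripping (link URLs, for instance, would still be indexed as words).

```clojure
(ns elboob.sketch.compiler
  (:require [clojure.java.io :as io]
            [clojure.string :as str]))

;; Assumption: a tiny illustrative stop-word list; the README does not
;; specify which common words are to be excluded.
(def stop-words #{"a" "an" "and" "is" "of" "the" "this" "to"})

(defn tokenise
  "Lower-case `text` and split it on runs of characters which are neither
  letters nor digits, so punctuation and Markdown markup characters are
  discarded; then drop stop words."
  [text]
  (->> (str/split (str/lower-case text) #"[^\p{L}\p{N}]+")
       (remove str/blank?)
       (remove stop-words)))

(defn index-file
  "Return a map from each token in `text` to the single-entry map
  `{path frequency}`."
  [path text]
  (into {}
        (map (fn [[token n]] [token {path n}]))
        (frequencies (tokenise text))))

(defn compile-index
  "Merge the per-file indexes of a collection of `[path text]` pairs into
  one map of token -> {path frequency}."
  [files]
  (reduce (fn [index [path text]]
            (merge-with merge index (index-file path text)))
          {}
          files))

(defn compile-dirs
  "Index every `.md` file found under each directory in `dirs`.
  Paths are recorded as given, relative if `dirs` are relative."
  [dirs]
  (compile-index
   (for [dir dirs
         f (file-seq (io/file dir))
         :when (and (.isFile f) (str/ends-with? (.getName f) ".md"))]
     [(.getPath f) (slurp f)])))
```

Given the `aquarius.md` example above as its only input, `compile-index` in this sketch yields the same three-entry map shown in the expected output.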
			
This map is then stored in a file `elboob.edn` in the root directory of the Cryogen public output. Whether the source path name (e.g. `content/md/posts/`) should be converted to the target path name (e.g. `/blog/posts-output/`) at compile time or at search time is something I'll decide later.
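
Storing the map could be as simple as printing it as EDN and reading it back with `clojure.edn`. The sketch below assumes the index contains only strings and integers, as in the example above; the namespace, the function names and the `public-root` parameter are hypothetical, and no path conversion is attempted, since that decision is still open.

```clojure
(ns elboob.sketch.store
  (:require [clojure.edn :as edn]
            [clojure.java.io :as io]))

(defn write-index!
  "Write `index` as EDN to `elboob.edn` under `public-root` (assumed to be
  the root of the Cryogen public output). Returns the file written."
  [index public-root]
  (let [target (io/file public-root "elboob.edn")]
    (spit target (pr-str index))
    target))

(defn read-index
  "Read the index back from `elboob.edn` under `public-root`."
  [public-root]
  (edn/read-string (slurp (io/file public-root "elboob.edn"))))
```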
			
### The searcher
			
The searcher is a little ClojureScript function which, given a sequence of search terms, will read the `elboob.edn` file and produce a web page showing a list of files which contain one or more of those search terms, ordered by the product of the number of occurrences of each search term in the file.
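
The ranking step might look something like the sketch below. It is written as plain Clojure using only functions which also exist in ClojureScript, and it assumes the index has already been read into a map. The names are hypothetical, and one interpretation is assumed: the product is taken only over those search terms which actually occur in a file, so that a file matching some but not all of the terms still gets a non-zero score. Fetching `elboob.edn` and rendering the web page are left out.

```clojure
(ns elboob.sketch.searcher
  (:require [clojure.string :as str]))

(defn score-files
  "Return a map from file path to score, where the score is the product of
  the occurrence counts of those `terms` which occur in that file."
  [index terms]
  (reduce (fn [scores term]
            ;; a path not yet scored takes this term's count; a path
            ;; already scored has its score multiplied by the count
            (merge-with * scores (get index term {})))
          {}
          (distinct (map str/lower-case terms))))

(defn search
  "Return the paths of files containing one or more of `terms`,
  highest score first."
  [index terms]
  (->> (score-files index terms)
       (sort-by val >)
       (map key)))
```

One consequence of ranking by product is that a file containing a single term many times can outrank a file containing every term once; whether that is the desired behaviour is a design choice this sketch does not settle.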
			
## Implementation
			
Has not started yet.
			
## License
			
Copyright © 2025 Simon Brooke. Licensed under the GNU General Public License, version 2.0 or (at your option) any later version.