import requests
from typing import Dict, List, Optional
import re
import json
from web_scraper import get_cover_image


class BlogGenerator:
    def __init__(self, openai_key: str, openrouter_key: str, serpapi_key: Optional[str] = None):
        self.openai_key = openai_key
        self.openrouter_key = openrouter_key
        # serpapi_key is now optional since we're using our own image search

    def get_cover_image(self, title: str) -> Optional[str]:
        """Get a cover image URL for the blog post, or None if the search fails."""
        try:
            # Use our custom image search function. The bare name resolves to the
            # module-level web_scraper.get_cover_image, not to this method.
            image_url = get_cover_image(title)
            if not image_url:
                print("No image found, trying with modified query...")
                # Try again with a modified query if the plain title fails
                image_url = get_cover_image(title + " high quality cover image")
            return image_url
        except Exception as e:
            print(f"Error in get_cover_image: {e}")
            return None

    def create_detailed_plan(self, cluster_data: Dict, preliminary_plan: str, research: str) -> str:
        try:
            response = requests.post(
                'https://openrouter.ai/api/v1/chat/completions',
                headers={
                    'Authorization': f'Bearer {self.openrouter_key}',
                    'HTTP-Referer': 'http://localhost:5001',
                    'X-Title': 'Blog Generator'
                },
                json={
                    'model': 'google/gemini-2.0-flash-thinking-exp:free',
                    'messages': [{
                        'role': 'user',
                        'content': f"""You are part of a team that creates world class blog posts.
For each new blog post project, you are provided with a list of keywords, a primary keyword, search intent, research findings, and a preliminary blog post plan. Here's a definition of each of the inputs:
- Keywords: These are the keywords which the blog post is meant to rank for on SEO. They should be scattered throughout the blog post intelligently to help with SEO.
- Search intent: The search intent recognises the intent of the user when searching up the keyword. Our goal is to optimise the blog post to be highly relevant and valuable to the user, as such the search intent should be satisfied within the blog post.
- Research findings: This is research found from very reputable resources in relation to the blog post. You must intelligently use this research to make your blog post more reputable.
- Preliminary plan: A very basic plan set out by your colleague to kick off the blog post.
- Primary keyword: Out of the keywords, there is one keyword known as the primary keyword. The primary keyword is the keyword which has the highest SEO importance and as such must go in the title and first few sentences of the blog post. It is important that the blog post is highly relevant to the primary keyword, so that it could be placed naturally into the title and introduction sections.
Given the above info, you must create a detailed plan for the blog post.
Your output must:
- Include a plan for the blog post.
- Be in dot point format.
- In each part of the blog post, you must mention which keywords should be placed.
- All keywords must be placed inside the blog post. For each section, mention which keywords to include. The keyword placement must feel natural and must make sense.
- You must include all research points in the blog post. When including the research points, make sure to also include their source URL so that the copywriter can use them as hyperlinks.
- Your plan must satisfy the search intent and revolve directly around the given keywords.
- Your plan must be very detailed.
- Keep in mind the copywriter that will use your plan to write the blog post is not an expert in the topic of the blog post. So you should give them all the detail required so they can just turn it into nicely formatted paragraphs. For example, instead of saying "define X", you must have "define X as ...".
- The plan must have a flow that makes sense.
- Ensure the blog post will be highly detailed and satisfies the most important concepts regarding the topic.
A new project has just come across your desk with the following details:
Keywords: {cluster_data['Keywords']}
Primary keyword: {cluster_data['Primary Keyword']}
Search intent: {cluster_data['Intent']}
Preliminary plan: {preliminary_plan}
Research findings: {research}
Create the detailed plan."""
                    }]
                },
                timeout=60
            )
            if response.status_code != 200:
                raise Exception(f"OpenRouter API error: {response.text}")
            response_data = response.json()
            if 'choices' not in response_data:
                raise Exception(f"Unexpected API response format: {response_data}")
            return response_data['choices'][0]['message']['content']
        except Exception as e:
            print(f"Error in create_detailed_plan: {e}")
            raise

    def write_blog_post(self, detailed_plan: str, cluster_data: Dict) -> str:
        try:
            response = requests.post(
                'https://openrouter.ai/api/v1/chat/completions',
                headers={
                    'Authorization': f'Bearer {self.openrouter_key}',
                    'HTTP-Referer': 'http://localhost:5001',
                    'X-Title': 'Blog Generator'
                },
                json={
                    'model': 'google/gemini-2.0-flash-thinking-exp:free',
                    'messages': [{
                        'role': 'user',
                        'content': f"""You are part of a team that creates world-class blog posts.
You are the team's best copywriter and are responsible for writing out the actual blog post.
For each new blog post project you are provided with a detailed plan and research findings.
Your job is to create the blog post by closely following the detailed plan.
The blog post you create must:
- Follow the plan bit by bit
- Use short paragraphs
- Use bullet points and subheadings with keywords where appropriate
- Not have any fluff. The content must be value dense and direct
- Be very detailed
- Include the keywords mentioned in each section within that section
- Place the primary keyword in the blog title, H1 header and early in the introduction
- Place one keyword for each section in the heading of that section
- When possible pepper synonyms of the keywords throughout each section
- When possible use Latent Semantic Indexing (LSI) keywords and related terms
- Be at least 2000 words long (aim for 2000 to 2500 words)
- Be suitable for a year 5 reading level
Make sure to create the entire blog post draft in your first output. Don't stop or cut it short.
Here are the details for your next blog post:
Keywords: {cluster_data['Keywords']}
Primary keyword: {cluster_data['Primary Keyword']}
Search intent: {cluster_data['Intent']}
Detailed plan: {detailed_plan}
Write the blog post."""
                    }]
                },
                timeout=120
            )
            if response.status_code != 200:
                raise Exception(f"OpenRouter API error: {response.text}")
            response_data = response.json()
            if 'choices' not in response_data:
                raise Exception(f"Unexpected API response format: {response_data}")
            return response_data['choices'][0]['message']['content']
        except Exception as e:
            print(f"Error in write_blog_post: {e}")
            raise

    def add_internal_links(self, blog_content: str, previous_posts: List[Dict]) -> str:
        try:
            response = requests.post(
                # 'https://api.openai.com/v1/chat/completions',
                # headers={
                #     'Authorization': f'Bearer {self.openai_key}',
                #     'Content-Type': 'application/json'
                # },
                # json={
                #     'model': 'gpt-4',
                'https://openrouter.ai/api/v1/chat/completions',
                headers={
                    'Authorization': f'Bearer {self.openrouter_key}',
                    'HTTP-Referer': 'http://localhost:5001',
                    'X-Title': 'Blog Generator'
                },
                json={
                    'model': 'google/gemini-2.0-flash-thinking-exp:free',
                    'messages': [{
                        'role': 'user',
                        'content': f"""You are part of a team that creates world class blog posts.
You are in charge of internal linking between blog posts.
For each new blog post that comes across your desk, your job is to look through previously posted blogs and make at least 5 internal links.
To choose the best internal linking opportunities you must:
- Read the previous blog post summaries and look through their keywords. If there is a match where the previous blog post is highly relevant, then this is an internal linking opportunity.
- Do not link if it is not highly relevant. Only make a link if it makes sense and adds value for the reader.
Once you've found the best linking opportunities, you must update the blog post with the internal links. To do this you must:
- Add the link of the previous blog post at the relevant section of the new blog post. Drop the URL at the place which makes most sense. Later we will hyperlink the URL to the word in the blog post which it is placed next to. So your placing is very important.
Make sure to:
- Not delete any existing URLs or change anything about the blog post
- Only add new internal linking URLs
- Place URLs next to relevant anchor text
- Add at least 5 internal links if possible
- Only link when truly relevant and valuable
- Preserve all original content and formatting
Current blog post:
{blog_content}
Previous blog posts:
{json.dumps(previous_posts, indent=2)}
Your output must be the complete blog post with new internal links added. Don't summarize or modify the content - just add the URLs in appropriate places."""
                    }]
                },
                timeout=120
            )
            if response.status_code != 200:
                raise Exception(f"OpenRouter API error: {response.text}")
            response_data = response.json()
            if 'choices' not in response_data:
                raise Exception(f"Unexpected API response format: {response_data}")
            return response_data['choices'][0]['message']['content']
        except Exception as e:
            print(f"Error in add_internal_links: {e}")
            # If there's an error, return the original content
            return blog_content

    def convert_to_html(self, blog_content: str, image_url: str = None) -> str:
        try:
            # First replace newlines with <br> tags
            formatted_content = blog_content.replace('\n', '<br>')
            # Optional image instruction (currently unused: the prompt below
            # embeds the <img> tag directly)
            image_instruction = ""
            if image_url:
                image_instruction = f"Add this image at the top of the post after the title: {image_url}"
            response = requests.post(
                # 'https://api.openai.com/v1/chat/completions',
                # headers={
                #     'Authorization': f'Bearer {self.openai_key}',
                #     'Content-Type': 'application/json'
                # },
                # json={
                #     'model': 'anthropic/claude-3.5-sonnet:beta',
                'https://openrouter.ai/api/v1/chat/completions',
                headers={
                    'Authorization': f'Bearer {self.openrouter_key}',
                    'HTTP-Referer': 'http://localhost:5001',
                    'X-Title': 'Blog Generator'
                },
                json={
                    'model': 'google/gemini-2.0-flash-thinking-exp:free',
                    'messages': [{
                        'role': 'user',
                        'content': f"""DO NOT OUTPUT ANYTHING OTHER THAN THE HTML CODE. Follow this layout template to generate WordPress code for a blog post:
The blog post should have:
- Title
{f"- Featured image: <img src='{image_url}' alt='Featured image' style='width: 100%; height: auto; margin: 20px 0;'>" if image_url else ""}
- Estimated reading time
- Key takeaways
- Table of contents
- Body
- FAQ
Rules:
- Make it engaging using italics, dot points, quotes, bold, spaces, and new lines. No emojis.
- Hyperlink any referenced URLs to their adjacent keyphrases
- Wrap content in container <div> with inline CSS for black text (#000000), Arial/sans-serif font, 1.6 line height
- Set non-heading text to 20px and black (#000000) with !important
- Style links, TOC points, and FAQ questions in blue (#00c2ff)
- Add blue (#00c2ff) underline border to headings with padding
- Add double breaks (<br>) between sections
- Output only the HTML code, no extra text
Blog post content:
{formatted_content}"""
                    }]
                },
                timeout=120
            )
            if response.status_code != 200:
                raise Exception(f"OpenRouter API error: {response.text}")
            response_data = response.json()
            if 'choices' not in response_data:
                raise Exception(f"Unexpected API response format: {response_data}")
            blog_content1 = response_data['choices'][0]['message']['content']
            # Strip newlines so the HTML is returned as a single line
            formatted_content1 = blog_content1.replace('\n', '')
            return formatted_content1
        except Exception as e:
            print(f"Error in convert_to_html: {e}")
            # If there's an error, return the original content
            return blog_content

    def generate_metadata(self, blog_content: str, primary_keyword: str, cluster_data: Dict) -> Dict:
        try:
            # Generate slug
            slug_response = requests.post(
                # 'https://api.openai.com/v1/chat/completions',
                # headers={
                #     'Authorization': f'Bearer {self.openai_key}',
                #     'Content-Type': 'application/json'
                # },
                # json={
                #     'model': 'gpt-4',
                'https://openrouter.ai/api/v1/chat/completions',
                headers={
                    'Authorization': f'Bearer {self.openrouter_key}',
                    'HTTP-Referer': 'http://localhost:5001',
                    'X-Title': 'Blog Generator'
                },
                json={
                    'model': 'google/gemini-2.0-flash-thinking-exp:free',
                    'messages': [{
                        'role': 'user',
                        'content': f"""Create a slug for the following blog post:
{blog_content}
A slug in a blog post is the part of the URL that comes after the domain name and identifies a specific page. It is typically a short, descriptive phrase that summarizes the content of the post, making it easier for users and search engines to understand what the page is about. For example, in the URL www.example.com/intelligent-agents, the slug is intelligent-agents. A good slug is concise, contains relevant keywords, and avoids unnecessary words to improve readability and SEO.
The slug must be 4 or 5 words max and must include the primary keyword of the blog post which is {primary_keyword}.
Your output must be the slug and nothing else so that I can copy and paste your output and put it at the end of my blog post URL to post it right away."""
                    }]
                },
                timeout=60
            )
            if slug_response.status_code != 200:
                raise Exception(f"OpenRouter API error: {slug_response.text}")
            slug_data = slug_response.json()
            if 'choices' not in slug_data:
                raise Exception(f"Unexpected API response format: {slug_data}")
            slug = slug_data['choices'][0]['message']['content'].strip().lower()
            # Generate title
            title_response = requests.post(
                # 'https://api.openai.com/v1/chat/completions',
                # headers={
                #     'Authorization': f'Bearer {self.openai_key}',
                #     'Content-Type': 'application/json'
                # },
                # json={
                #     'model': 'gpt-4',
                'https://openrouter.ai/api/v1/chat/completions',
                headers={
                    'Authorization': f'Bearer {self.openrouter_key}',
                    'HTTP-Referer': 'http://localhost:5001',
                    'X-Title': 'Blog Generator'
                },
                json={
                    'model': 'google/gemini-2.0-flash-thinking-exp:free',
                    'messages': [{
                        'role': 'user',
                        'content': f"""Extract the blog post title from the following blog post:
{blog_content}
The blog post title must include the primary keyword {primary_keyword} and must inform the users right away of what they can expect from reading the blog post.
- Don't put the output in "". The output should just be text with no markdown or formatting.
Your output must only be the blog post title and nothing else."""
                    }]
                },
                timeout=60
            )
            if title_response.status_code != 200:
                raise Exception(f"OpenRouter API error: {title_response.text}")
            title_data = title_response.json()
            if 'choices' not in title_data:
                raise Exception(f"Unexpected API response format: {title_data}")
            title = title_data['choices'][0]['message']['content'].strip()
            # Generate meta description
            meta_desc_response = requests.post(
                # 'https://api.openai.com/v1/chat/completions',
                # headers={
                #     'Authorization': f'Bearer {self.openai_key}',
                #     'Content-Type': 'application/json'
                # },
                # json={
                #     'model': 'gpt-4',
                'https://openrouter.ai/api/v1/chat/completions',
                headers={
                    'Authorization': f'Bearer {self.openrouter_key}',
                    'HTTP-Referer': 'http://localhost:5001',
                    'X-Title': 'Blog Generator'
                },
                json={
                    'model': 'google/gemini-2.0-flash-thinking-exp:free',
                    'messages': [{
                        'role': 'user',
                        'content': f"""Create a proper meta description for the following blog post:
{blog_content}
A good meta description for a blog post that is SEO-optimized should:
- Be Concise: Stick to 150-160 characters to ensure the full description displays in search results.
- Include Keywords: Incorporate primary keywords naturally to improve visibility and relevance to search queries.
Primary keyword = {primary_keyword}
More keywords to include if possible = [{cluster_data['Keywords']}]
- Provide Value: Clearly describe what the reader will learn or gain by clicking the link.
- Be Engaging: Use persuasive language, such as action verbs or a question, to encourage clicks.
- Align with Content: Ensure the description accurately reflects the blog post to meet user expectations and reduce bounce rates.
Your output must only be the meta description and nothing else."""
                    }]
                },
                timeout=60
            )
            if meta_desc_response.status_code != 200:
                raise Exception(f"OpenRouter API error: {meta_desc_response.text}")
            meta_desc_data = meta_desc_response.json()
            if 'choices' not in meta_desc_data:
                raise Exception(f"Unexpected API response format: {meta_desc_data}")
            meta_desc = meta_desc_data['choices'][0]['message']['content'].strip()
            # Validate the results
            if not title or not meta_desc or not slug:
                raise Exception("Empty title, meta description, or slug")
            return {
                'slug': slug,
                'title': title,
                'meta_description': meta_desc
            }
        except Exception as e:
            print(f"Error in generate_metadata: {e}")
            raise
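

# ---------------------------------------------------------------------------
# Sketch: post-processing helpers for raw model output (not in the original).
#
# BlogGenerator indexes response_data['choices'][0] after checking only that
# the 'choices' key exists, and it uses the model's slug text after nothing
# more than strip().lower(). The two functions below are a minimal sketch of
# how those steps could be made more defensive. The names
# extract_message_content and normalize_slug, and the max_words default, are
# assumptions introduced here for illustration; nothing in the class calls
# them.
# ---------------------------------------------------------------------------
import re  # already imported at the top of the file; repeated so the sketch stands alone


def extract_message_content(response_data: dict) -> str:
    """Return the first choice's message text from a chat-completions style response."""
    choices = response_data.get('choices')
    # Treat a missing key and an empty list the same way, so an empty
    # 'choices' list does not surface later as an IndexError.
    if not choices:
        raise ValueError(f"Unexpected API response format: {response_data}")
    return choices[0]['message']['content']


def normalize_slug(raw: str, max_words: int = 5) -> str:
    """Reduce free-form model output to a lowercase, hyphen-separated slug."""
    # Replace every run of characters other than a-z and 0-9 with one hyphen,
    # then drop any hyphens left at either end.
    cleaned = re.sub(r'[^a-z0-9]+', '-', raw.strip().lower()).strip('-')
    # Keep at most max_words words, mirroring the "4 or 5 words max" rule
    # stated in the slug prompt.
    return '-'.join(cleaned.split('-')[:max_words])


# Possible usage inside generate_metadata (an assumption, not wired in):
#     slug = normalize_slug(extract_message_content(slug_data))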